Preface


The purpose of this book is to provide an introduction to principles of 
probability, random variables, and random processes and their applications. 


The book is designed for students in various disciplines of engineering, 
science, mathematics, and management. It may be used as a textbook and/or as 
a supplement to all current comparable texts. It should also be useful to those 
interested in the field for self-study. The book combines the advantages of both 
the textbook and the so-called review book. It provides the textual explanations 
of the textbook, and in the direct way characteristic of the review book, it gives 
hundreds of completely solved problems that use essential theory and 
techniques. Moreover, the solved problems are an integral part of the text. The 
background required to study the book is one year of calculus, elementary 
differential equations, matrix analysis, and some signal and system theory, 
including Fourier transforms. 


I wish to thank Dr. Gordon Silverman for his invaluable suggestions and 
critical review of the manuscript. I also wish to express my appreciation to the 
editorial staff of the McGraw-Hill Schaum Series for their care, cooperation, 
and attention devoted to the preparation of the book. Finally, I thank my wife, 
Daisy, for her patience and encouragement. 


HWEI P. HSU 
MONTVILLE, NEW JERSEY 


Contents 


Chapter 1. Probability 


1.1 Introduction 

1.2 Sample Space and Events 

1.3 Algebra of Sets 

1.4 The Notion and Axioms of Probability 
1.5 Equally Likely Events 

1.6 Conditional Probability 

1.7 Total Probability 

1.8 Independent Events 

Solved Problems 


Chapter 2. Random Variables 


2.1 Introduction 

2.2 Random Variables 

2.3 Distribution Functions 

2.4 Discrete Random Variables and Probability Mass Functions 

2.5 Continuous Random Variables and Probability Density Functions 
2.6 Mean and Variance 

2.7 Some Special Distributions 

2.8 Conditional Distributions 

Solved Problems 


Chapter 3. Multiple Random Variables 


3.1 Introduction 

3.2 Bivariate Random Variables 

3.3 Joint Distribution Functions 

3.4 Discrete Random Variables - Joint Probability Mass Functions 


3.5 Continuous Random Variables - Joint Probability Density Functions 


3.6 Conditional Distributions 

3.7 Covariance and Correlation Coefficient 

3.8 Conditional Means and Conditional Variances 
3.9 N-Variate Random Variables 

3.10 Special Distributions 

Solved Problems 



Chapter 4. Functions of Random Variables, Expectation, Limit Theorems


4.1 Introduction
4.2 Functions of One Random Variable
4.3 Functions of Two Random Variables
4.4 Functions of n Random Variables
4.5 Expectation
4.6 Moment Generating Functions
4.7 Characteristic Functions
4.8 The Laws of Large Numbers and the Central Limit Theorem
Solved Problems


Chapter 5. Random Processes


5.1 Introduction
5.2 Random Processes
5.3 Characterization of Random Processes
5.4 Classification of Random Processes
5.5 Discrete-Parameter Markov Chains
5.6 Poisson Processes
5.7 Wiener Processes
Solved Problems


Chapter 6. Analysis and Processing of Random Processes


6.1 Introduction
6.2 Continuity, Differentiation, Integration
6.3 Power Spectral Densities
6.4 White Noise
6.5 Response of Linear Systems to Random Inputs
6.6 Fourier Series and Karhunen-Loève Expansions
6.7 Fourier Transform of Random Processes
Solved Problems


Chapter 7. Estimation Theory


7.1 Introduction
7.2 Parameter Estimation
7.3 Properties of Point Estimators
7.4 Maximum-Likelihood Estimation
7.5 Bayes' Estimation
7.6 Mean Square Estimation
7.7 Linear Mean Square Estimation
Solved Problems



Chapter 8. Decision Theory 


8.1 Introduction 

8.2 Hypothesis Testing 
8.3 Decision Tests 
Solved Problems 


Chapter 9. Queueing Theory 


9.1 Introduction 

9.2 Queueing Systems 

9.3 Birth-Death Process 

9.4 The M/M/1 Queueing System 
9.5 The M/M/s Queueing System 
9.6 The M/M/1/K Queueing System 
9.7 The M/M/s/K Queueing System 
Solved Problems 


Appendix A. Normal Distribution 
Appendix B. Fourier Transform 


B.1 Continuous-Time Fourier Transform 
B.2 Discrete-Time Fourier Transform 


Index 



Chapter 1 


Probability 


1.1 INTRODUCTION


The study of probability stems from the analysis of certain games of chance, and it has found 
applications in most branches of science and engineering. In this chapter the basic concepts of prob- 
ability theory are presented. 


1.2 SAMPLE SPACE AND EVENTS
A. Random Experiments: 


In the study of probability, any process of observation is referred to as an experiment. The results 
of an observation are called the outcomes of the experiment. An experiment is called a random experi- 
ment if its outcome cannot be predicted. Typical examples of a random experiment are the roll of a 
die, the toss of a coin, drawing a card from a deck, or selecting a message signal for transmission from 
several messages. 


B. Sample Space: 

The set of all possible outcomes of a random experiment is called the sample space (or universal 
set), and it is denoted by S. An element in S is called a sample point. Each outcome of a random 
experiment corresponds to a sample point. 

EXAMPLE 1.1 Find the sample space for the experiment of tossing a coin (a) once and (b) twice.
(a) There are two possible outcomes, heads or tails. Thus 
S = {H, T} 
where H and T represent head and tail, respectively. 
(b) There are four possible outcomes. They are pairs of heads and tails. Thus 
S = {HH, HT, TH, TT} 
EXAMPLE 1.2 Find the sample space for the experiment of tossing a coin repeatedly and of counting the number 
of tosses required until the first head appears. 
Clearly all possible outcomes for this experiment are the terms of the sequence 1, 2, 3, .... Thus 
S = {1, 2, 3,...} 


Note that there are an infinite number of outcomes. 


EXAMPLE 1.3 Find the sample space for the experiment of measuring (in hours) the lifetime of a transistor. 
Clearly all possible outcomes are all nonnegative real numbers. That is, 
$S = \{t : 0 \le t < \infty\}$


where t represents the life of a transistor in hours. 


Note that any particular experiment can often have many different sample spaces depending on the
observation of interest (Probs. 1.1 and 1.2). A sample space S is said to be discrete if it consists of a finite number of
sample points (as in Example 1.1) or countably infinite sample points (as in Example 1.2). A set is called countable
if its elements can be placed in a one-to-one correspondence with the positive integers. A sample space S is said
to be continuous if the sample points constitute a continuum (as in Example 1.3).


C. Events: 


Since we have identified a sample space S as the set of all possible outcomes of a random experi- 
ment, we will review some set notations in the following. 
If $\zeta$ is an element of S (or belongs to S), then we write

$\zeta \in S$

If $\zeta$ is not an element of S (or does not belong to S), then we write

$\zeta \notin S$

A set A is called a subset of B, denoted by

$A \subset B$

if every element of A is also an element of B. Any subset of the sample space S is called an event. A
sample point of S is often referred to as an elementary event. Note that the sample space S is a
subset of itself, that is, $S \subset S$. Since S is the set of all possible outcomes, it is often called the certain
event.


EXAMPLE 1.4 Consider the experiment of Example 1.2. Let A be the event that the number of tosses required 
until the first head appears is even. Let B be the event that the number of tosses required until the first head 
appears is odd. Let C be the event that the number of tosses required until the first head appears is less than 5. 
Express events A, B, and C. 


A = {2, 4, 6, ...}
B = {1, 3, 5, ...}
C = {1, 2, 3, 4}


1.3 ALGEBRA OF SETS 
A. Set Operations: 
1. Equality:
Two sets A and B are equal, denoted A = B, if and only if $A \subset B$ and $B \subset A$.
2. Complementation:

Suppose $A \subset S$. The complement of set A, denoted $\bar{A}$, is the set containing all elements in S but
not in A.

$\bar{A} = \{\zeta : \zeta \in S \text{ and } \zeta \notin A\}$
3. Union:

The union of sets A and B, denoted $A \cup B$, is the set containing all elements in either A or B or
both.

$A \cup B = \{\zeta : \zeta \in A \text{ or } \zeta \in B\}$
4. Intersection:

The intersection of sets A and B, denoted $A \cap B$, is the set containing all elements in both A
and B.

$A \cap B = \{\zeta : \zeta \in A \text{ and } \zeta \in B\}$
5. Null Set:
The set containing no element is called the null set, denoted $\emptyset$. Note that
$\emptyset = \bar{S}$
6. Disjoint Sets:

Two sets A and B are called disjoint or mutually exclusive if they contain no common element,
that is, if $A \cap B = \emptyset$.

The definitions of the union and intersection of two sets can be extended to any finite number of 
sets as follows: 


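$\bigcup_{i=1}^{n} A_i = A_1 \cup A_2 \cup \cdots \cup A_n = \{\zeta : \zeta \in A_1 \text{ or } \zeta \in A_2 \text{ or } \cdots \text{ or } \zeta \in A_n\}$

$\bigcap_{i=1}^{n} A_i = A_1 \cap A_2 \cap \cdots \cap A_n = \{\zeta : \zeta \in A_1 \text{ and } \zeta \in A_2 \text{ and } \cdots \text{ and } \zeta \in A_n\}$


Note that these definitions can be extended to an infinite number of sets:


$\bigcup_{i=1}^{\infty} A_i = A_1 \cup A_2 \cup A_3 \cup \cdots$

$\bigcap_{i=1}^{\infty} A_i = A_1 \cap A_2 \cap A_3 \cap \cdots$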


In our definition of event, we state that every subset of S is an event, including S and the null set
$\emptyset$. Then


S = the certain event
$\emptyset$ = the impossible event


If A and B are events in S, then


$\bar{A}$ = the event that A did not occur
$A \cup B$ = the event that either A or B or both occurred
$A \cap B$ = the event that both A and B occurred


Similarly, if $A_1, A_2, \ldots, A_n$ are a sequence of events in S, then


$\bigcup_{i=1}^{n} A_i$ = the event that at least one of the $A_i$ occurred;

$\bigcap_{i=1}^{n} A_i$ = the event that all of the $A_i$ occurred.


B. Venn Diagram: 


A graphical representation that is very useful for illustrating set operations is the Venn diagram.
For instance, in the three Venn diagrams shown in Fig. 1-1, the shaded areas represent, respectively,
the events $A \cup B$, $A \cap B$, and $\bar{A}$. The Venn diagram in Fig. 1-2 indicates that $B \subset A$, and the event
$A \cap \bar{B}$ is shown as the shaded area.


[Fig. 1-1: Venn diagrams. (a) Shaded region: $A \cup B$. (b) Shaded region: $A \cap B$. (c) Shaded region: $\bar{A}$.]


[Fig. 1-2: Venn diagram with $B \subset A$. Shaded region: $A \cap \bar{B}$.]


C. Identities: 


By the above set definitions or reference to Fig. 1-1, we obtain the following identities:

$\bar{S} = \emptyset$  (1.1)
$\bar{\emptyset} = S$  (1.2)
$\bar{\bar{A}} = A$  (1.3)
$S \cup A = S$  (1.4)
$S \cap A = A$  (1.5)
$A \cup \bar{A} = S$  (1.6)
$A \cap \bar{A} = \emptyset$  (1.7)


The union and intersection operations also satisfy the following laws:


Commutative Laws:
$A \cup B = B \cup A$  (1.8)
$A \cap B = B \cap A$  (1.9)


Associative Laws:
$A \cup (B \cup C) = (A \cup B) \cup C$  (1.10)
$A \cap (B \cap C) = (A \cap B) \cap C$  (1.11)


Distributive Laws:
$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$  (1.12)
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$  (1.13)


De Morgan's Laws:
$\overline{A \cup B} = \bar{A} \cap \bar{B}$  (1.14)
$\overline{A \cap B} = \bar{A} \cup \bar{B}$  (1.15)

These relations are verified by showing that any element that is contained in the set on the left side of 
the equality sign is also contained in the set on the right side, and vice versa. One way of showing this 
is by means of a Venn diagram (Prob. 1.13). The distributive laws can be extended as follows: 


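$A \cap \left( \bigcup_{i=1}^{n} B_i \right) = \bigcup_{i=1}^{n} (A \cap B_i)$  (1.16)
$A \cup \left( \bigcap_{i=1}^{n} B_i \right) = \bigcap_{i=1}^{n} (A \cup B_i)$  (1.17)


Similarly, De Morgan's laws can be extended as follows (Prob. 1.17):


$\overline{\left( \bigcup_{i=1}^{n} A_i \right)} = \bigcap_{i=1}^{n} \bar{A}_i$  (1.18)
$\overline{\left( \bigcap_{i=1}^{n} A_i \right)} = \bigcup_{i=1}^{n} \bar{A}_i$  (1.19)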
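
The distributive laws and De Morgan's laws can be spot-checked on a small finite sample space. The following is a minimal sketch, not a proof: the sample space `S` and the events `A`, `B`, `C` are arbitrary values chosen only for this illustration, and Python's built-in `set` type stands in for an event.

```python
# Spot-check of Eqs. (1.12)-(1.15) on an arbitrarily chosen finite sample space.
S = set(range(1, 11))          # hypothetical sample space {1, ..., 10}
A = {1, 2, 3, 4, 5}            # hypothetical events (subsets of S)
B = {4, 5, 6, 7}
C = {2, 5, 7, 9}


def complement(event, space):
    """Return the complement of `event` relative to `space`."""
    return space - event


def check_identities(A, B, C, S):
    """Return a dict mapping each law to whether it holds for A, B, C in S."""
    comp = lambda E: complement(E, S)
    return {
        "distributive (1.12)": A & (B | C) == (A & B) | (A & C),
        "distributive (1.13)": A | (B & C) == (A | B) & (A | C),
        "De Morgan (1.14)": comp(A | B) == comp(A) & comp(B),
        "De Morgan (1.15)": comp(A & B) == comp(A) | comp(B),
    }


results = check_identities(A, B, C, S)
for law, holds in results.items():
    print(f"{law}: {holds}")
```

A check on one choice of events only illustrates the laws; the element-wise argument described above is what establishes them in general.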


1.4 THE NOTION AND AXIOMS OF PROBABILITY 


An assignment of real numbers to the events defined in a sample space S is known as the prob- 
ability measure. Consider a random experiment with a sample space S, and let A be a particular event 
defined in S. 


A. Relative Frequency Definition: 


Suppose that the random experiment is repeated n times. If event A occurs n(A) times, then the 
probability of event A, denoted P(A), is defined as 


$P(A) = \lim_{n \to \infty} \frac{n(A)}{n}$  (1.20)


where n(A)/n is called the relative frequency of event A. Note that this limit may not exist, and in 
addition, there are many situations in which the concepts of repeatability may not be valid. It is clear 
that for any event A, the relative frequency of A will have the following properties: 


1. $0 \le n(A)/n \le 1$, where n(A)/n = 0 if A occurs in none of the n repeated trials and n(A)/n = 1 if A
occurs in all of the n repeated trials.


2. If A and B are mutually exclusive events, then


$n(A \cup B) = n(A) + n(B)$

and

$\frac{n(A \cup B)}{n} = \frac{n(A)}{n} + \frac{n(B)}{n}$


B. Axiomatic Definition: 


Let S be a finite sample space and A be an event in S. Then in the axiomatic definition, the 
probability P(A) of the event A is a real number assigned to A which satisfies the following three 
axioms: 


Axiom 1: $P(A) \ge 0$  (1.21)
Axiom 2: $P(S) = 1$  (1.22)
Axiom 3: $P(A \cup B) = P(A) + P(B)$ if $A \cap B = \emptyset$  (1.23)


If the sample space S is not finite, then axiom 3 must be modified as follows:


Axiom 3′: If $A_1, A_2, \ldots$ is an infinite sequence of mutually exclusive events in S ($A_i \cap A_j = \emptyset$
for $i \ne j$), then


$P\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} P(A_i)$  (1.24)


These axioms satisfy our intuitive notion of probability measure obtained from the notion of relative 
frequency. 
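
The relative frequency n(A)/n in Eq. (1.20) can be illustrated by simulation. The sketch below assumes a fair six-sided die and uses Python's standard `random` module; the function name `relative_frequency` and the fixed seed are choices made for this illustration only.

```python
import random


def relative_frequency(event, trials, seed=0):
    """Estimate n(A)/n for rolls of a fair six-sided die.

    `event` is a set of faces; `seed` is fixed only to make the sketch repeatable.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) in event)
    return hits / trials


even = {2, 4, 6}
for n in (100, 10_000, 100_000):
    # The estimate should settle near 1/2 as the number of trials grows.
    print(n, relative_frequency(even, n))
```

For any finite n the printed value is only an estimate; the definition in Eq. (1.20) concerns the limit, which, as noted above, need not exist in general.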


C. Elementary Properties of Probability: 


By using the above axioms, the following useful properties of probability can be obtained: 


1. $P(\bar{A}) = 1 - P(A)$  (1.25)
2. $P(\emptyset) = 0$  (1.26)
3. $P(A) \le P(B)$ if $A \subset B$  (1.27)
4. $P(A) \le 1$  (1.28)
5. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$  (1.29)
6. If $A_1, A_2, \ldots, A_n$ are n arbitrary events in S, then

$P\left( \bigcup_{i=1}^{n} A_i \right) = \sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P(A_1 \cap A_2 \cap \cdots \cap A_n)$  (1.30)

where the sum of the second term is over all distinct pairs of events, that of the third term is over
all distinct triples of events, and so forth.
7. If $A_1, A_2, \ldots, A_n$ is a finite sequence of mutually exclusive events in S ($A_i \cap A_j = \emptyset$ for $i \ne j$),
then

$P\left( \bigcup_{i=1}^{n} A_i \right) = \sum_{i=1}^{n} P(A_i)$  (1.31)

and a similar equality holds for any subcollection of the events.


Note that property 4 can be easily derived from axiom 2 and property 3. Since $A \subset S$, we have

$P(A) \le P(S) = 1$

Thus, combining with axiom 1, we obtain

$0 \le P(A) \le 1$  (1.32)

Property 5 implies that

$P(A \cup B) \le P(A) + P(B)$  (1.33)

since $P(A \cap B) \ge 0$ by axiom 1.
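
Property 6, Eq. (1.30), can be checked numerically on a small example. The sketch below is illustrative only: it assumes a finite sample space whose points are equally likely, so that probabilities are ratios of counts, and the particular `space` and `events` are arbitrary choices. Exact arithmetic is done with `fractions.Fraction`.

```python
from fractions import Fraction
from itertools import combinations


def prob(event, space):
    """P(event) when every point of the finite `space` is equally likely."""
    return Fraction(len(event & space), len(space))


def inclusion_exclusion(events, space):
    """Right-hand side of Eq. (1.30): alternating sum over all r-fold intersections."""
    total = Fraction(0)
    for r in range(1, len(events) + 1):
        sign = (-1) ** (r - 1)
        for group in combinations(events, r):   # each distinct r-tuple once
            total += sign * prob(set.intersection(*group), space)
    return total


space = set(range(1, 13))                        # hypothetical 12-point sample space
events = [{1, 2, 3, 4}, {3, 4, 5, 6, 7}, {1, 4, 7, 10}]

lhs = prob(set.union(*events), space)            # P(A1 ∪ A2 ∪ A3) by direct counting
rhs = inclusion_exclusion(events, space)         # Eq. (1.30)
print(lhs, rhs)
```

With two events the same function reduces to property 5, Eq. (1.29).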


1.5 EQUALLY LIKELY EVENTS
A. Finite Sample Space:
Consider a finite sample space S with n elements

$S = \{\zeta_1, \zeta_2, \ldots, \zeta_n\}$

where the $\zeta_i$'s are elementary events. Let $P(\zeta_i) = p_i$. Then

1. $0 \le p_i \le 1$, $i = 1, 2, \ldots, n$  (1.34)
2. $\sum_{i=1}^{n} p_i = p_1 + p_2 + \cdots + p_n = 1$  (1.35)
3. If $A = \bigcup_{i \in I} \zeta_i$, where I is a collection of subscripts, then

$P(A) = \sum_{\zeta_i \in A} P(\zeta_i) = \sum_{i \in I} p_i$  (1.36)


B. Equally Likely Events:
When all elementary events $\zeta_i$ (i = 1, 2, ..., n) are equally likely, that is,

$p_1 = p_2 = \cdots = p_n$

then from Eq. (1.35), we have

$p_i = \frac{1}{n}$, $i = 1, 2, \ldots, n$  (1.37)

and

$P(A) = \frac{n(A)}{n}$  (1.38)

where n(A) is the number of outcomes belonging to event A and n is the number of sample points
in S.
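
Equation (1.38) turns probability into counting. As a minimal sketch, assume one card is drawn at random from a standard 52-card deck, so that all 52 sample points are equally likely; the deck encoding and the helper `prob` are choices made for this illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
S = list(product(ranks, suits))        # 52 equally likely sample points (rank, suit)


def prob(predicate):
    """P(A) = n(A)/n, Eq. (1.38), for A = {card in S : predicate(card)}."""
    n_A = sum(1 for card in S if predicate(card))
    return Fraction(n_A, len(S))


p_ace = prob(lambda c: c[0] == "A")
p_heart = prob(lambda c: c[1] == "hearts")
p_ace_or_heart = prob(lambda c: c[0] == "A" or c[1] == "hearts")
p_ace_and_heart = prob(lambda c: c[0] == "A" and c[1] == "hearts")

print(p_ace, p_heart, p_ace_or_heart)
# Property 5, Eq. (1.29), expressed with these counts:
print(p_ace_or_heart == p_ace + p_heart - p_ace_and_heart)
```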


1.6 CONDITIONAL PROBABILITY
A. Definition:

The conditional probability of an event A given event B, denoted by P(A | B), is defined as

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$,  $P(B) > 0$  (1.39)

where $P(A \cap B)$ is the joint probability of A and B. Similarly,

$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$,  $P(A) > 0$  (1.40)

is the conditional probability of an event B given event A. From Eqs. (1.39) and (1.40), we have

$P(A \cap B) = P(A \mid B) P(B) = P(B \mid A) P(A)$  (1.41)

Equation (1.41) is often quite useful in computing the joint probability of events.

B. Bayes' Rule:

From Eq. (1.41) we can obtain the following Bayes' rule:

$P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$  (1.42)

1.7 TOTAL PROBABILITY
The events $A_1, A_2, \ldots, A_n$ are called mutually exclusive and exhaustive if

$\bigcup_{i=1}^{n} A_i = A_1 \cup A_2 \cup \cdots \cup A_n = S$ and $A_i \cap A_j = \emptyset$, $i \ne j$  (1.43)

Let B be any event in S. Then

$P(B) = \sum_{i=1}^{n} P(B \cap A_i) = \sum_{i=1}^{n} P(B \mid A_i) P(A_i)$  (1.44)

which is known as the total probability of event B (Prob. 1.47). Let A = $A_i$ in Eq. (1.42); then, using
Eq. (1.44), we obtain

$P(A_i \mid B) = \frac{P(B \mid A_i) P(A_i)}{\sum_{k=1}^{n} P(B \mid A_k) P(A_k)}$  (1.45)

Note that the terms on the right-hand side are all conditioned on events $A_i$, while the term on the left
is conditioned on B. Equation (1.45) is sometimes referred to as Bayes' theorem.
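
Equations (1.44) and (1.45) translate directly into a short computation. The sketch below is a hedged illustration: the function names and the numerical scenario (three machines with made-up production shares and defect rates) are assumptions introduced here, not data from the text.

```python
from fractions import Fraction


def total_probability(priors, likelihoods):
    """Eq. (1.44): P(B) = sum_i P(B | A_i) P(A_i)."""
    if sum(priors) != 1:
        raise ValueError("priors must sum to 1 (the A_i must be exhaustive)")
    return sum(l * p for l, p in zip(likelihoods, priors))


def bayes(priors, likelihoods):
    """Eq. (1.45): return the list of posteriors P(A_i | B)."""
    p_b = total_probability(priors, likelihoods)
    return [l * p / p_b for l, p in zip(likelihoods, priors)]


# Hypothetical numbers: machines A_1, A_2, A_3 produce 50%, 30%, 20% of all items,
# with defect rates 1%, 2%, 3%. B is the event "a randomly chosen item is defective".
priors = [Fraction(50, 100), Fraction(30, 100), Fraction(20, 100)]
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]

print(total_probability(priors, likelihoods))
print(bayes(priors, likelihoods))
```

The posteriors returned by `bayes` sum to 1, since the denominator in Eq. (1.45) is exactly the sum of the numerators over all i.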


1.8 INDEPENDENT EVENTS


Two events A and B are said to be (statistically) independent if and only if

$P(A \cap B) = P(A) P(B)$  (1.46)

It follows immediately that if A and B are independent, then by Eqs. (1.39) and (1.40),

$P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$  (1.47)

If two events A and B are independent, then it can be shown that A and $\bar{B}$ are also independent; that
is (Prob. 1.53),

$P(A \cap \bar{B}) = P(A) P(\bar{B})$  (1.48)

Then

$P(A \mid \bar{B}) = \frac{P(A \cap \bar{B})}{P(\bar{B})} = P(A)$  (1.49)

Thus, if A is independent of B, then the probability of A's occurrence is unchanged by information as
to whether or not B has occurred. Three events A, B, C are said to be independent if and only if

$P(A \cap B \cap C) = P(A) P(B) P(C)$
$P(A \cap B) = P(A) P(B)$
$P(A \cap C) = P(A) P(C)$
$P(B \cap C) = P(B) P(C)$  (1.50)

We may also extend the definition of independence to more than three events. The events $A_1, A_2, \ldots, A_n$
are independent if and only if for every subset $\{A_{i_1}, A_{i_2}, \ldots, A_{i_k}\}$ ($2 \le k \le n$) of these events,

$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) P(A_{i_2}) \cdots P(A_{i_k})$  (1.51)

Finally, we define an infinite set of events to be independent if and only if every finite subset of these
events is independent.

To distinguish between the mutual exclusiveness (or disjointness) and independence of a collection
of events we summarize as follows:

1. If $\{A_i, i = 1, 2, \ldots, n\}$ is a sequence of mutually exclusive events, then

$P\left( \bigcup_{i=1}^{n} A_i \right) = \sum_{i=1}^{n} P(A_i)$  (1.52)

2. If $\{A_i, i = 1, 2, \ldots, n\}$ is a sequence of independent events, then

$P\left( \bigcap_{i=1}^{n} A_i \right) = \prod_{i=1}^{n} P(A_i)$  (1.53)

and a similar equality holds for any subcollection of the events.
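
The requirement in Eq. (1.51) that the product rule hold for every subcollection matters: pairwise independence alone is not enough. The sketch below illustrates this with two tosses of a fair coin, assuming the four outcomes are equally likely; the events `A`, `B`, `C` and the helper names are choices made for this example.

```python
from fractions import Fraction
from itertools import combinations, product

S = set(product("HT", repeat=2))           # two tosses of a fair coin, 4 equally likely points


def prob(event):
    """P(event) by counting, Eq. (1.38)."""
    return Fraction(len(event), len(S))


def independent(events):
    """Check Eq. (1.51): the product rule for every subcollection of size >= 2."""
    for k in range(2, len(events) + 1):
        for group in combinations(events, k):
            joint = prob(set.intersection(*group))
            product_of_marginals = Fraction(1)
            for e in group:
                product_of_marginals *= prob(e)
            if joint != product_of_marginals:
                return False
    return True


A = {s for s in S if s[0] == "H"}          # head on the first toss
B = {s for s in S if s[1] == "H"}          # head on the second toss
C = {s for s in S if s[0] != s[1]}         # exactly one head in the two tosses

print(independent([A, B]), independent([A, C]), independent([B, C]))
print(independent([A, B, C]))
```

Here each pair satisfies the product rule, but the triple does not, because A, B, and C cannot all occur together while P(A)P(B)P(C) = 1/8.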


Solved Problems 


SAMPLE SPACE AND EVENTS 


1.1. Consider a random experiment of tossing a coin three times.

(a) Find the sample space $S_1$ if we wish to observe the exact sequences of heads and tails
obtained.
(b) Find the sample space $S_2$ if we wish to observe the number of heads in the three tosses.

(a) The sample space $S_1$ is given by

$S_1$ = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}

where, for example, HTH indicates a head on the first and third throws and a tail on the second
throw. There are eight sample points in $S_1$.

(b) The sample space $S_2$ is given by

$S_2$ = {0, 1, 2, 3}

where, for example, the outcome 2 indicates that two heads were obtained in the three tosses. The
sample space $S_2$ contains four sample points.


1.2. Consider an experiment of drawing two cards at random from a bag containing four cards
marked with the integers 1 through 4.

(a) Find the sample space $S_1$ of the experiment if the first card is replaced before the second is
drawn.
(b) Find the sample space $S_2$ of the experiment if the first card is not replaced.

(a) The sample space $S_1$ contains 16 ordered pairs (i, j), $1 \le i \le 4$, $1 \le j \le 4$, where the first number
indicates the first number drawn. Thus,

$S_1$ = {(1, 1), (1, 2), (1, 3), (1, 4),
         (2, 1), (2, 2), (2, 3), (2, 4),
         (3, 1), (3, 2), (3, 3), (3, 4),
         (4, 1), (4, 2), (4, 3), (4, 4)}

(b) The sample space $S_2$ contains 12 ordered pairs (i, j), $i \ne j$, $1 \le i \le 4$, $1 \le j \le 4$, where the first number
indicates the first number drawn. Thus,

$S_2$ = {(1, 2), (1, 3), (1, 4),
         (2, 1), (2, 3), (2, 4),
         (3, 1), (3, 2), (3, 4),
         (4, 1), (4, 2), (4, 3)}


1.3. An experiment consists of rolling a die until a 6 is obtained.

(a) Find the sample space $S_1$ if we are interested in all possibilities.
(b) Find the sample space $S_2$ if we are interested in the number of throws needed to get a 6.

(a) The sample space $S_1$ would be

$S_1$ = {6,
         16, 26, 36, 46, 56,
         116, 126, 136, 146, 156, ...}

where the first line indicates that a 6 is obtained in one throw, the second line indicates that a 6 is
obtained in two throws, and so forth.

(b) In this case, the sample space $S_2$ is

$S_2 = \{i : i = 1, 2, 3, \ldots\}$

where i is an integer representing the number of throws needed to get a 6.


1.4. Find the sample space for the experiment consisting of measurement of the voltage output v from
a transducer, the maximum and minimum of which are +5 and −5 volts, respectively.

A suitable sample space for this experiment would be

$S = \{v : -5 \le v \le 5\}$


1.5. An experiment consists of tossing two dice.

(a) Find the sample space S.
(b) Find the event A that the sum of the dots on the dice equals 7.
(c) Find the event B that the sum of the dots on the dice is greater than 10.
(d) Find the event C that the sum of the dots on the dice is greater than 12.

(a) For this experiment, the sample space S consists of 36 points (Fig. 1-3):

$S = \{(i, j) : i, j = 1, 2, 3, 4, 5, 6\}$

where i represents the number of dots appearing on one die and j represents the number of dots
appearing on the other die.

(b) The event A consists of 6 points (see Fig. 1-3):

A = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}

(c) The event B consists of 3 points (see Fig. 1-3):

B = {(5, 6), (6, 5), (6, 6)}

(d) The event C is an impossible event, that is, $C = \emptyset$.
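
The sample spaces of Probs. 1.2 and 1.5 are small enough to enumerate by machine. The following is a minimal sketch using Python's standard `itertools`; the variable names are choices made for this illustration. Drawing with replacement corresponds to a Cartesian product, and drawing without replacement to ordered selections of distinct items.

```python
from itertools import permutations, product

cards = [1, 2, 3, 4]

# Prob. 1.2(a): first card replaced before the second draw -> all ordered pairs.
S1 = list(product(cards, repeat=2))

# Prob. 1.2(b): no replacement -> ordered pairs with distinct entries.
S2 = list(permutations(cards, 2))

# Prob. 1.5: two dice, and the events defined by the sum of the dots.
dice = list(product(range(1, 7), repeat=2))
A = [s for s in dice if sum(s) == 7]
B = [s for s in dice if sum(s) > 10]
C = [s for s in dice if sum(s) > 12]

print(len(S1), len(S2), len(dice))
print(A)
print(B)
print(C)
```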
[Fig. 1-3: The 36 sample points (i, j), i, j = 1, ..., 6, of the two-dice experiment, with the events A and B marked.]


1.6. An automobile dealer offers vehicles with the following options:

(a) With or without automatic transmission
(b) With or without air-conditioning
(c) With one of two choices of a stereo system
(d) With one of three exterior colors

If the sample space consists of the set of all possible vehicle types, what is the number of
outcomes in the sample space?

The tree diagram for the different types of vehicles is shown in Fig. 1-4. From Fig. 1-4 we see that the
number of sample points in S is 2 × 2 × 2 × 3 = 24.

[Fig. 1-4: Tree diagram of vehicle types, with levels Transmission (Automatic, Manual), Air-conditioning, Stereo, and Color.]


1.7. State every possible event in the sample space S = {a, b, c, d}.

There are $2^4 = 16$ possible events in S. They are $\emptyset$; {a}, {b}, {c}, {d}; {a, b}, {a, c}, {a, d}, {b, c},
{b, d}, {c, d}; {a, b, c}, {a, b, d}, {a, c, d}, {b, c, d}; S = {a, b, c, d}.


1.8. How many events are there in a sample space S with n elementary events?

Let $S = \{s_1, s_2, \ldots, s_n\}$. Let $\Omega$ be the family of all subsets of S. ($\Omega$ is sometimes referred to as the power
set of S.) Let $S_i$ be the set consisting of two statements, that is,

$S_i$ = {Yes, the $s_i$ is in; No, the $s_i$ is not in}

Then $\Omega$ can be represented as the Cartesian product

$\Omega = S_1 \times S_2 \times \cdots \times S_n = \{(s_1, s_2, \ldots, s_n) : s_i \in S_i \text{ for } i = 1, 2, \ldots, n\}$

Since each subset of S can be uniquely characterized by an element in the above Cartesian product, we
obtain the number of elements in $\Omega$ by

$n(\Omega) = n(S_1) n(S_2) \cdots n(S_n) = 2^n$

where $n(S_i)$ = number of elements in $S_i$ = 2.
An alternative way of finding $n(\Omega)$ is by the following summation:

$n(\Omega) = \sum_{i=0}^{n} \binom{n}{i} = \sum_{i=0}^{n} \frac{n!}{i!(n-i)!}$

The proof that the last sum is equal to $2^n$ is not easy.
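
Both counts in Prob. 1.8 can be confirmed for small n by brute force. The sketch below is illustrative; the helper `power_set` is a name introduced here, built from the standard `itertools` and `math` modules. One standard route to the closed form is the binomial theorem applied to $(1 + 1)^n$, which gives $\sum_{i=0}^{n} \binom{n}{i} = 2^n$.

```python
from itertools import chain, combinations
from math import comb


def power_set(elements):
    """Return every subset (event) of `elements` as a tuple, ordered by size."""
    items = list(elements)
    return list(chain.from_iterable(combinations(items, r) for r in range(len(items) + 1)))


events = power_set("abcd")
print(len(events))                                  # number of events for n = 4
print(sum(comb(4, i) for i in range(5)))            # the binomial sum for n = 4

# Compare the enumeration, the binomial sum, and 2**n for several n.
for n in range(8):
    print(n, len(power_set(range(n))), sum(comb(n, i) for i in range(n + 1)), 2 ** n)
```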


ALGEBRA OF SETS 
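1.9. Consider the experiment of Example 1.2. We define the events

$A = \{k : k \text{ is odd}\}$
$B = \{k : 4 \le k \le 7\}$
$C = \{k : 1 \le k \le 10\}$

where k is the number of tosses required until the first H (head) appears. Determine the events $\bar{A}$,
$\bar{B}$, $\bar{C}$, $A \cup B$, $B \cup C$, $A \cap B$, $A \cap C$, $B \cap C$, and $\bar{A} \cap B$.

$\bar{A} = \{k : k \text{ is even}\} = \{2, 4, 6, \ldots\}$
$\bar{B} = \{k : k = 1, 2, 3 \text{ or } k \ge 8\}$
$\bar{C} = \{k : k \ge 11\}$
$A \cup B = \{k : k \text{ is odd or } k = 4, 6\}$
$B \cup C = C$
$A \cap B = \{5, 7\}$
$A \cap C = \{1, 3, 5, 7, 9\}$
$B \cap C = B$
$\bar{A} \cap B = \{4, 6\}$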

1.10. The sample space of an experiment is the real line expressed as 
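$S = \{v : -\infty < v < \infty\}$

(a) Consider the events

$A_1 = \{v : 0 \le v < \frac{1}{2}\}$
$A_2 = \{v : \frac{1}{2} \le v < \frac{3}{4}\}$
...
$A_i = \{v : 1 - \frac{1}{2^{i-1}} \le v < 1 - \frac{1}{2^i}\}$

Determine the events

$\bigcup_{i=1}^{\infty} A_i$ and $\bigcap_{i=1}^{\infty} A_i$

(b) Consider the events

$B_1 = \{v : v \le \frac{1}{2}\}$
$B_2 = \{v : v \le \frac{1}{4}\}$
...
$B_i = \{v : v \le \frac{1}{2^i}\}$
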
Determine the events

$\bigcup_{i=1}^{\infty} B_i$ and $\bigcap_{i=1}^{\infty} B_i$

(a) It is clear that

$\bigcup_{i=1}^{\infty} A_i = \{v : 0 \le v < 1\}$

Noting that the $A_i$'s are mutually exclusive, we have

$\bigcap_{i=1}^{\infty} A_i = \emptyset$

(b) Noting that $B_1 \supset B_2 \supset \cdots \supset B_i \supset \cdots$, we have

$\bigcup_{i=1}^{\infty} B_i = B_1 = \{v : v \le \frac{1}{2}\}$ and $\bigcap_{i=1}^{\infty} B_i = \{v : v \le 0\}$


1.11. Consider the switching networks shown in Fig. 1-5. Let $A_1$, $A_2$, and $A_3$ denote the events that
the switches $s_1$, $s_2$, and $s_3$ are closed, respectively. Let $A_{ab}$ denote the event that there is a closed
path between terminals a and b. Express $A_{ab}$ in terms of $A_1$, $A_2$, and $A_3$ for each of the networks
shown.

[Fig. 1-5: Switching networks (a), (b), (c), and (d) between terminals a and b.]

(a) From Fig. 1-5(a), we see that there is a closed path between a and b only if all switches $s_1$, $s_2$, and $s_3$
are closed. Thus,

$A_{ab} = A_1 \cap A_2 \cap A_3$

(b) From Fig. 1-5(b), we see that there is a closed path between a and b if at least one switch is closed.
Thus,

$A_{ab} = A_1 \cup A_2 \cup A_3$

(c) From Fig. 1-5(c), we see that there is a closed path between a and b if $s_1$ and either $s_2$ or $s_3$ are closed.
Thus,

$A_{ab} = A_1 \cap (A_2 \cup A_3)$

Using the distributive law (1.12), we have

$A_{ab} = (A_1 \cap A_2) \cup (A_1 \cap A_3)$

which indicates that there is a closed path between a and b if $s_1$ and $s_2$ or $s_1$ and $s_3$ are closed.

(d) From Fig. 1-5(d), we see that there is a closed path between a and b if either $s_1$ and $s_2$ are closed or $s_3$
is closed. Thus

$A_{ab} = (A_1 \cap A_2) \cup A_3$
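
The four expressions for the closed-path event in Prob. 1.11 can be tabulated over all eight switch settings. In the sketch below, a Boolean argument stands for "switch closed", so `and` plays the role of intersection and `or` the role of union; the function names are introduced here for illustration and follow the expressions derived in parts (a) through (d).

```python
from itertools import product


def network_a(s1, s2, s3):
    return s1 and s2 and s3            # A1 ∩ A2 ∩ A3


def network_b(s1, s2, s3):
    return s1 or s2 or s3              # A1 ∪ A2 ∪ A3


def network_c(s1, s2, s3):
    return s1 and (s2 or s3)           # A1 ∩ (A2 ∪ A3)


def network_c_expanded(s1, s2, s3):
    return (s1 and s2) or (s1 and s3)  # (A1 ∩ A2) ∪ (A1 ∩ A3)


def network_d(s1, s2, s3):
    return (s1 and s2) or s3           # (A1 ∩ A2) ∪ A3


states = list(product([False, True], repeat=3))   # all 8 switch settings

# The two forms of network (c) agree on every setting, as the distributive law (1.12) requires.
print(all(network_c(*s) == network_c_expanded(*s) for s in states))

# Number of settings that give a closed path in each network.
for name, net in [("a", network_a), ("b", network_b), ("c", network_c), ("d", network_d)]:
    print(name, sum(net(*s) for s in states))
```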
1.12. Verify the distributive law (1.12).

Let $s \in [A \cap (B \cup C)]$. Then $s \in A$ and $s \in (B \cup C)$. This means either that $s \in A$ and $s \in B$ or that
$s \in A$ and $s \in C$; that is, $s \in (A \cap B)$ or $s \in (A \cap C)$. Therefore,

$A \cap (B \cup C) \subset [(A \cap B) \cup (A \cap C)]$

Next, let $s \in [(A \cap B) \cup (A \cap C)]$. Then $s \in A$ and $s \in B$ or $s \in A$ and $s \in C$. Thus $s \in A$ and ($s \in B$ or
$s \in C$). Thus,

$[(A \cap B) \cup (A \cap C)] \subset [A \cap (B \cup C)]$

Thus, by the definition of equality, we have

$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$


1.13. Using a Venn diagram, repeat Prob. 1.12.

Figure 1-6 shows the sequence of relevant Venn diagrams. Comparing Fig. 1-6(b) and 1-6(e), we
conclude that

$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$

[Fig. 1-6: Venn diagrams. (c) Shaded region: $A \cap B$. (d) Shaded region: $A \cap C$. (e) Shaded region: $(A \cap B) \cup (A \cap C)$.]


1.14. Let A and B be arbitrary events. Show that $A \subset B$ if and only if $A \cap B = A$.

"If" part: We show that if $A \cap B = A$, then $A \subset B$. Let $s \in A$. Then $s \in (A \cap B)$, since $A = A \cap B$.
Then by the definition of intersection, $s \in B$. Therefore, $A \subset B$.

"Only if" part: We show that if $A \subset B$, then $A \cap B = A$. Note that from the definition of the
intersection, $(A \cap B) \subset A$. Suppose $s \in A$. If $A \subset B$, then $s \in B$. So $s \in A$ and $s \in B$; that is, $s \in (A \cap B)$. Therefore,
it follows that $A \subset (A \cap B)$. Hence, $A = A \cap B$. This completes the proof.


1.15. Let A be an arbitrary event in S and let $\emptyset$ be the null event. Show that
(a) $A \cup \emptyset = A$  (1.54)
(b) $A \cap \emptyset = \emptyset$  (1.55)
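(a) $A \cup \emptyset = \{s : s \in A \text{ or } s \in \emptyset\}$
But, by definition, there are no $s \in \emptyset$. Thus,

$A \cup \emptyset = \{s : s \in A\} = A$

(b) $A \cap \emptyset = \{s : s \in A \text{ and } s \in \emptyset\}$
But, since there are no $s \in \emptyset$, there cannot be an s such that $s \in A$ and $s \in \emptyset$. Thus,

$A \cap \emptyset = \emptyset$

Note that Eq. (1.55) shows that $\emptyset$ is mutually exclusive with every other event, including
itself.


1.16. Show that the null (or empty) set $\emptyset$ is a subset of every set A.

From the definition of intersection, it follows that

$(A \cap B) \subset A$ and $(A \cap B) \subset B$  (1.56)

for any pair of events, whether they are mutually exclusive or not. If A and B are mutually exclusive events,
that is, $A \cap B = \emptyset$, then by Eq. (1.56) we obtain

$\emptyset \subset A$ and $\emptyset \subset B$  (1.57)

Therefore, for any event A,

$\emptyset \subset A$  (1.58)

that is, $\emptyset$ is a subset of every set A.


1.17. Verify Eqs. (1.18) and (1.19).

(a) Suppose first that $s \in \overline{\left( \bigcup_{i=1}^{n} A_i \right)}$; then $s \notin \left( \bigcup_{i=1}^{n} A_i \right)$.

That is, if s is not contained in any of the events $A_i$, i = 1, 2, ..., n, then s is contained in $\bar{A}_i$ for all
i = 1, 2, ..., n. Thus

$s \in \bigcap_{i=1}^{n} \bar{A}_i$

Next, we assume that

$s \in \bigcap_{i=1}^{n} \bar{A}_i$

Then s is contained in $\bar{A}_i$ for all i = 1, 2, ..., n, which means that s is not contained in $A_i$ for any i = 1,
2, ..., n, implying that

$s \notin \bigcup_{i=1}^{n} A_i$

Thus, $s \in \overline{\left( \bigcup_{i=1}^{n} A_i \right)}$

This proves Eq. (1.18).

(b) Using Eqs. (1.18) and (1.3), we have

$\overline{\left( \bigcup_{i=1}^{n} \bar{A}_i \right)} = \bigcap_{i=1}^{n} \bar{\bar{A}}_i = \bigcap_{i=1}^{n} A_i$

Taking complements of both sides of the above yields

$\bigcup_{i=1}^{n} \bar{A}_i = \overline{\left( \bigcap_{i=1}^{n} A_i \right)}$

which is Eq. (1.19).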
THE NOTION AND AXIOMS OF PROBABILITY
1.18. Using the axioms of probability, prove Eq. (1.25).

We have

$S = A \cup \bar{A}$ and $A \cap \bar{A} = \emptyset$

Thus, by axioms 2 and 3, it follows that

$P(S) = 1 = P(A) + P(\bar{A})$

from which we obtain

$P(\bar{A}) = 1 - P(A)$


1.19. Verify Eq. (1.26).

From Eq. (1.25), we have

$P(\bar{A}) = 1 - P(A)$

Let $A = \emptyset$. Then, by Eq. (1.2), $\bar{A} = \bar{\emptyset} = S$, and by axiom 2 we obtain

$P(\emptyset) = 1 - P(S) = 1 - 1 = 0$


1.20. Verify Eq. (1.27).

Let $A \subset B$. Then from the Venn diagram shown in Fig. 1-7, we see that

$B = A \cup (\bar{A} \cap B)$ and $A \cap (\bar{A} \cap B) = \emptyset$

Hence, from axiom 3,

$P(B) = P(A) + P(\bar{A} \cap B)$

However, by axiom 1, $P(\bar{A} \cap B) \ge 0$. Thus, we conclude that

$P(A) \le P(B)$ if $A \subset B$

[Fig. 1-7: Venn diagram with $A \subset B$. Shaded region: $\bar{A} \cap B$.]


1.21. Verify Eq. (1.29).

From the Venn diagram of Fig. 1-8, each of the sets $A \cup B$ and B can be represented, respectively, as a
union of mutually exclusive sets as follows:

$A \cup B = A \cup (\bar{A} \cap B)$ and $B = (A \cap B) \cup (\bar{A} \cap B)$  (1.59)

Thus, by axiom 3,

$P(A \cup B) = P(A) + P(\bar{A} \cap B)$  (1.60)

and

$P(B) = P(A \cap B) + P(\bar{A} \cap B)$  (1.61)

From Eq. (1.61), we have

$P(\bar{A} \cap B) = P(B) - P(A \cap B)$  (1.62)

Substituting Eq. (1.62) into Eq. (1.60), we obtain

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

[Fig. 1-8: Venn diagrams of the events A and B with shaded regions.]


1.22. Let P(A) = 0.9 and P(B) = 0.8. Show that $P(A \cap B) \ge 0.7$.

From Eq. (1.29), we have

$P(A \cap B) = P(A) + P(B) - P(A \cup B)$

By Eq. (1.32), $0 \le P(A \cup B) \le 1$. Hence

$P(A \cap B) \ge P(A) + P(B) - 1$  (1.63)

Substituting the given values of P(A) and P(B) in Eq. (1.63), we get

$P(A \cap B) \ge 0.9 + 0.8 - 1 = 0.7$

Equation (1.63) is known as Bonferroni's inequality.


1.23. Show that

$P(A) = P(A \cap B) + P(A \cap \bar{B})$  (1.64)

From the Venn diagram of Fig. 1-9, we see that

$A = (A \cap B) \cup (A \cap \bar{B})$ and $(A \cap B) \cap (A \cap \bar{B}) = \emptyset$  (1.65)

Thus, by axiom 3, we have

$P(A) = P(A \cap B) + P(A \cap \bar{B})$


1.24. Given that P(A) = 0.9, P(B) = 0.8, and $P(A \cap B) = 0.75$, find (a) $P(A \cup B)$; (b) $P(A \cap \bar{B})$; and (c)
$P(\bar{A} \cap \bar{B})$.

(a) By Eq. (1.29), we have

$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.9 + 0.8 - 0.75 = 0.95$

(b) By Eq. (1.64) (Prob. 1.23), we have

$P(A \cap \bar{B}) = P(A) - P(A \cap B) = 0.9 - 0.75 = 0.15$

(c) By De Morgan's law, Eq. (1.14), and Eq. (1.25) and using the result from part (a), we get

$P(\bar{A} \cap \bar{B}) = P(\overline{A \cup B}) = 1 - P(A \cup B) = 1 - 0.95 = 0.05$
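
The arithmetic of Probs. 1.22 and 1.24 can be reproduced exactly. The sketch below is a small check using `fractions.Fraction` to avoid floating-point rounding; the variable names are choices made for this illustration.

```python
from fractions import Fraction

P_A = Fraction(9, 10)
P_B = Fraction(8, 10)
P_AB = Fraction(75, 100)              # P(A ∩ B), as given in Prob. 1.24

P_A_or_B = P_A + P_B - P_AB           # Eq. (1.29)
P_A_notB = P_A - P_AB                 # Eq. (1.64)
P_notA_notB = 1 - P_A_or_B            # De Morgan's law (1.14) with Eq. (1.25)
bonferroni_bound = P_A + P_B - 1      # right-hand side of Eq. (1.63)

print(float(P_A_or_B), float(P_A_notB), float(P_notA_notB))
print(P_AB >= bonferroni_bound)       # the given P(A ∩ B) respects Bonferroni's inequality
```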
1.25. For any three events $A_1$, $A_2$, and $A_3$, show that

$P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3)$  (1.66)

Let $B = A_2 \cup A_3$. By Eq. (1.29), we have

$P(A_1 \cup B) = P(A_1) + P(B) - P(A_1 \cap B)$  (1.67)

Using distributive law (1.12), we have

$A_1 \cap B = A_1 \cap (A_2 \cup A_3) = (A_1 \cap A_2) \cup (A_1 \cap A_3)$

Applying Eq. (1.29) to the above event, we obtain

$P(A_1 \cap B) = P(A_1 \cap A_2) + P(A_1 \cap A_3) - P[(A_1 \cap A_2) \cap (A_1 \cap A_3)]$
$= P(A_1 \cap A_2) + P(A_1 \cap A_3) - P(A_1 \cap A_2 \cap A_3)$  (1.68)

Applying Eq. (1.29) to the set $B = A_2 \cup A_3$, we have

$P(B) = P(A_2 \cup A_3) = P(A_2) + P(A_3) - P(A_2 \cap A_3)$  (1.69)

Substituting Eqs. (1.69) and (1.68) into Eq. (1.67), we get

$P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3)$


1.26. Prove that

$P\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} P(A_i)$  (1.70)

which is known as Boole's inequality.

We will prove Eq. (1.70) by induction. Suppose Eq. (1.70) is true for n = k.

$P\left( \bigcup_{i=1}^{k} A_i \right) \le \sum_{i=1}^{k} P(A_i)$

Then

$P\left( \bigcup_{i=1}^{k+1} A_i \right) = P\left[ \left( \bigcup_{i=1}^{k} A_i \right) \cup A_{k+1} \right]$
$\le P\left( \bigcup_{i=1}^{k} A_i \right) + P(A_{k+1})$  [by Eq. (1.33)]
$\le \sum_{i=1}^{k} P(A_i) + P(A_{k+1}) = \sum_{i=1}^{k+1} P(A_i)$

Thus Eq. (1.70) is also true for n = k + 1. By Eq. (1.33), Eq. (1.70) is true for n = 2. Thus, Eq. (1.70) is true
for $n \ge 2$.


1.27. Verify Eq. (1.31).

Again we prove it by induction. Suppose Eq. (1.31) is true for n = k.

$P\left( \bigcup_{i=1}^{k} A_i \right) = \sum_{i=1}^{k} P(A_i)$

Then

$P\left( \bigcup_{i=1}^{k+1} A_i \right) = P\left[ \left( \bigcup_{i=1}^{k} A_i \right) \cup A_{k+1} \right]$

Using the distributive law (1.16), we have

$\left( \bigcup_{i=1}^{k} A_i \right) \cap A_{k+1} = \bigcup_{i=1}^{k} (A_i \cap A_{k+1}) = \bigcup_{i=1}^{k} \emptyset = \emptyset$

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1.28. 


since A, 1 A; = @ fori #j. Thus, by axiom 3, we have 


kt k k+l 
(Ua) = (UA) + P(A.) = > P(A) 
i=l i=t 


which indicates that Eq. (/.3/) is also true for n =k + |. By axiom 3, Eq. (/.3/) is true for n = 2. Thus, it is 
true forn > 2. 


A sequence of events {A,,n > 1} is said to be an increasing sequence if [Fig. 1-10(a)] 


A, CA, O°: CA, CAL, Coe (1.71a) 
whereas it is said to be a decreasing sequence if [Fig. 1-10(b)] 
A, DA, D5: DA, > Ayy, Do (1.71b) 


(a) ()) 
Fig. 1-10 
If {A,, = 1} is an increasing sequence of events, we define a new event A,, by 
A, = limA, = J A; (1.72) 
n> x, f=1 
Similarly, if {A,,n > 1} is a decreasing sequence of events, we define a new event A,, by 
A,, = limA, = () 4; (1.73) 
non, i=] 
Show that if {A,, 2 > 1} is cither an increasing or a decreasing sequence of events, then 
lim P(A,) = P(A,,) (1.74) 


AyD 
which is known as the continuity theorem of probability. 


If{A,,n = 1} is an increasing sequence of events, then by definition 


Now, we define the events B,,n > 1, by 


BL =A, 
B, =A, 0A, 
B, =A, 0 An-1 


Thus, B, consists of those elements in A, that are not in any of the earlier A,, k <n. From the Venn 
diagram shown in Fig. 1-11, it is seen that B, are mutually exclusive events such that 


n a ~ 
(} B= 4; for alln > 1, and (J 8, = (JA, = A, 


i=l i=l f=1 i=l 
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Vy 


ee 


Thus, using axiom 3’, we have 


= lim Sr) = im PCC), 


a~o i=l noo 


lim af ( UJ A) = lim P(A,) (1.75) 


n> 0 i=t 


Next, if {A,, m > 1} is a decreasing sequence, then {A,, n> 1} is an increasing sequence. Hence, by Eq. 
(1.75), we have 


( C) Ai) ~ lim P(A, 


now 


From Eq. (1.19), 


Thus, al( Q A)] = lim P(A,) (1.76) 


atw 


Using Eq. (1.25), Eq. (1.76) reduces to 


1- ( (41) = lim[1 — P(A,)) = 1 — lim P(A,) 


no no 


Thus, ata (\A =m A,,) = lim P(A,) (1.77) 


fom oF 


Combining Eqs. (1.75) and (1.77), we obtain Eq. (1.74). 


EQUALLY LIKELY EVENTS 


1.29. Consider a telegraph source generating two symbols, dots and dashes. We observed that the dots were twice as likely to occur as the dashes. Find the probabilities of the dot's occurring and the dash's occurring.

From the observation, we have
$$P(\text{dot}) = 2P(\text{dash})$$
Then, by Eq. (1.35),
$$P(\text{dot}) + P(\text{dash}) = 3P(\text{dash}) = 1$$
Thus, $P(\text{dash}) = \frac{1}{3}$ and $P(\text{dot}) = \frac{2}{3}$.

1.30. The sample space S of a random experiment is given by
$$S = \{a, b, c, d\}$$
with probabilities $P(a) = 0.2$, $P(b) = 0.3$, $P(c) = 0.4$, and $P(d) = 0.1$. Let A denote the event $\{a, b\}$, and B the event $\{b, c, d\}$. Determine the following probabilities: (a) $P(A)$; (b) $P(B)$; (c) $P(\bar{A})$; (d) $P(A \cup B)$; and (e) $P(A \cap B)$.

Using Eq. (1.36), we obtain
(a) $P(A) = P(a) + P(b) = 0.2 + 0.3 = 0.5$
(b) $P(B) = P(b) + P(c) + P(d) = 0.3 + 0.4 + 0.1 = 0.8$
(c) $\bar{A} = \{c, d\}$; $P(\bar{A}) = P(c) + P(d) = 0.4 + 0.1 = 0.5$
(d) $A \cup B = \{a, b, c, d\} = S$; $P(A \cup B) = P(S) = 1$
(e) $A \cap B = \{b\}$; $P(A \cap B) = P(b) = 0.3$

1.31. An experiment consists of observing the sum of the dice when two fair dice are thrown (Prob. 1.5). Find (a) the probability that the sum is 7 and (b) the probability that the sum is greater than 10.

(a) Let $\zeta_{ij}$ denote the elementary event (sample point) consisting of the following outcome: $\zeta_{ij} = (i, j)$, where i represents the number appearing on one die and j represents the number appearing on the other die. Since the dice are fair, all the outcomes are equally likely. So $P(\zeta_{ij}) = \frac{1}{36}$. Let A denote the event that the sum is 7. Since the events $\zeta_{ij}$ are mutually exclusive and from Fig. 1-3 (Prob. 1.5), we have
$$P(A) = P(\zeta_{16} \cup \zeta_{25} \cup \zeta_{34} \cup \zeta_{43} \cup \zeta_{52} \cup \zeta_{61})$$
$$= P(\zeta_{16}) + P(\zeta_{25}) + P(\zeta_{34}) + P(\zeta_{43}) + P(\zeta_{52}) + P(\zeta_{61})$$
$$= 6\left(\tfrac{1}{36}\right) = \tfrac{1}{6}$$
(b) Let B denote the event that the sum is greater than 10. Then from Fig. 1-3, we obtain
$$P(B) = P(\zeta_{56} \cup \zeta_{65} \cup \zeta_{66}) = P(\zeta_{56}) + P(\zeta_{65}) + P(\zeta_{66}) = 3\left(\tfrac{1}{36}\right) = \tfrac{1}{12}$$

1.32. There are n persons in a room.

(a) What is the probability that at least two persons have the same birthday?
(b) Calculate this probability for n = 50.
(c) How large need n be for this probability to be greater than 0.5?

(a) As each person can have his or her birthday on any one of 365 days (ignoring the possibility of February 29), there are a total of $(365)^n$ possible outcomes. Let A be the event that no two persons have the same birthday. Then the number of outcomes belonging to A is
$$n(A) = (365)(364) \cdots (365 - n + 1)$$
Assuming that each outcome is equally likely, then by Eq. (1.38),
$$P(A) = \frac{n(A)}{n(S)} = \frac{(365)(364) \cdots (365 - n + 1)}{(365)^n} \tag{1.78}$$
Let B be the event that at least two persons have the same birthday. Then $B = \bar{A}$ and by Eq. (1.25), $P(B) = 1 - P(A)$.

(b) Substituting n = 50 in Eq. (1.78), we have
$$P(A) \approx 0.03 \quad \text{and} \quad P(B) \approx 1 - 0.03 = 0.97$$
(c) From Eq. (1.78), when n = 23, we have
$$P(A) \approx 0.493 \quad \text{and} \quad P(B) = 1 - P(A) \approx 0.507$$
That is, if there are 23 persons in a room, the probability that at least two of them have the same birthday exceeds 0.5.
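
The birthday computation above is easy to check numerically. The following Python sketch evaluates Eq. (1.78) as a running product and searches for the smallest n whose probability exceeds 0.5. The function name `p_shared_birthday` and the variable `smallest_n` are illustrative choices, not part of the text, and the sketch keeps the same assumption of 365 equally likely birthdays.

```python
from math import prod


def p_shared_birthday(n: int, days: int = 365) -> float:
    """Return P(B) = 1 - P(A), with P(A) taken from Eq. (1.78)."""
    if n > days:
        return 1.0  # more people than days: a shared birthday is certain
    # P(A) = (365/365)(364/365)...((365 - n + 1)/365)
    p_all_distinct = prod((days - i) / days for i in range(n))
    return 1.0 - p_all_distinct


# Smallest n for which the probability of a shared birthday exceeds 0.5
smallest_n = next(n for n in range(1, 367) if p_shared_birthday(n) > 0.5)

print(round(p_shared_birthday(50), 2))  # 0.97
print(smallest_n)                       # 23
```

Writing P(A) as a product of ratios, rather than dividing two very large integers, keeps every intermediate value between 0 and 1.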


1.33. A committee of 5 persons is to be selected randomly from a group of 5 men and 10 women.

(a) Find the probability that the committee consists of 2 men and 3 women.
(b) Find the probability that the committee consists of all women.

(a) The number of total outcomes is given by
$$n(S) = \binom{15}{5}$$
It is assumed that "random selection" means that each of the outcomes is equally likely. Let A be the event that the committee consists of 2 men and 3 women. Then the number of outcomes belonging to A is given by
$$n(A) = \binom{5}{2}\binom{10}{3}$$
Thus, by Eq. (1.38),
$$P(A) = \frac{n(A)}{n(S)} = \frac{\binom{5}{2}\binom{10}{3}}{\binom{15}{5}} = \frac{400}{1001} \approx 0.4$$
(b) Let B be the event that the committee consists of all women. Then the number of outcomes belonging to B is
$$n(B) = \binom{5}{0}\binom{10}{5}$$
Thus, by Eq. (1.38),
$$P(B) = \frac{n(B)}{n(S)} = \frac{\binom{5}{0}\binom{10}{5}}{\binom{15}{5}} = \frac{36}{429} \approx 0.084$$

1.34. Consider the switching network shown in Fig. 1-12. It is equally likely that a switch will or will not work. Find the probability that a closed path will exist between terminals a and b.

Fig. 1-12

Consider a sample space S of which a typical outcome is (1, 0, 0, 1), indicating that switches 1 and 4 are closed and switches 2 and 3 are open. The sample space contains $2^4 = 16$ points, and by assumption, they are equally likely (Fig. 1-13).

Let $A_i$, i = 1, 2, 3, 4, be the event that the switch $s_i$ is closed. Let A be the event that there exists a closed path between a and b. Then
$$A = A_1 \cup (A_2 \cap A_3) \cup (A_2 \cap A_4)$$
Applying Eq. (1.30), we have
$$P(A) = P[A_1 \cup (A_2 \cap A_3) \cup (A_2 \cap A_4)]$$
$$= P(A_1) + P(A_2 \cap A_3) + P(A_2 \cap A_4) - P[A_1 \cap (A_2 \cap A_3)] - P[A_1 \cap (A_2 \cap A_4)] - P[(A_2 \cap A_3) \cap (A_2 \cap A_4)] + P[A_1 \cap (A_2 \cap A_3) \cap (A_2 \cap A_4)]$$
$$= P(A_1) + P(A_2 \cap A_3) + P(A_2 \cap A_4) - P(A_1 \cap A_2 \cap A_3) - P(A_1 \cap A_2 \cap A_4) - P(A_2 \cap A_3 \cap A_4) + P(A_1 \cap A_2 \cap A_3 \cap A_4)$$
Now, for example, the event $A_2 \cap A_3$ contains all elementary events with a 1 in the second and third places. Thus, from Fig. 1-13, we see that
$$n(A_1) = 8 \qquad n(A_2 \cap A_3) = 4 \qquad n(A_2 \cap A_4) = 4$$
$$n(A_1 \cap A_2 \cap A_3) = 2 \qquad n(A_1 \cap A_2 \cap A_4) = 2$$
$$n(A_2 \cap A_3 \cap A_4) = 2 \qquad n(A_1 \cap A_2 \cap A_3 \cap A_4) = 1$$
Thus,
$$P(A) = \tfrac{8}{16} + \tfrac{4}{16} + \tfrac{4}{16} - \tfrac{2}{16} - \tfrac{2}{16} - \tfrac{2}{16} + \tfrac{1}{16} = \tfrac{11}{16} \approx 0.688$$

Fig. 1-13
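
Because the sample space of the switching network has only 16 equally likely points, the inclusion-exclusion result can be checked by direct enumeration. The Python sketch below assumes the path structure stated in the solution, namely that a closed path exists when switch 1 is closed, or when switch 2 is closed together with switch 3 or switch 4. The names `path_closed`, `outcomes`, and `p_path` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product


def path_closed(s1: int, s2: int, s3: int, s4: int) -> bool:
    """Event A = A1 or (A2 and A3) or (A2 and A4); 1 means the switch is closed."""
    return bool(s1 or (s2 and s3) or (s2 and s4))


# All 2**4 = 16 equally likely switch configurations, e.g. (1, 0, 0, 1)
outcomes = list(product((0, 1), repeat=4))
favorable = sum(path_closed(*o) for o in outcomes)
p_path = Fraction(favorable, len(outcomes))

print(p_path)         # 11/16
print(float(p_path))  # 0.6875
```

Enumeration scales poorly with the number of switches, but for a small network it gives an independent check on the seven-term expansion.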


1.35. Consider the experiment of tossing a fair coin repeatedly and counting the number of tosses 
required until the first head appears. 


(a) Find the sample space of the experiment. 
(b) Find the probability that the first head appears on the kth toss.
(c) Verify that P(S) = 1.

(a) The sample space of this experiment is
$$S = \{e_1, e_2, e_3, \ldots\} = \{e_k : k = 1, 2, 3, \ldots\}$$
where $e_k$ is the elementary event that the first head appears on the kth toss.

(b) Since a fair coin is tossed, we assume that a head and a tail are equally likely to appear. Then $P(H) = P(T) = \frac{1}{2}$. Let
$$P(e_k) = p_k \qquad k = 1, 2, 3, \ldots$$
Since there are $2^k$ equally likely ways of tossing a fair coin k times, only one of which consists of (k - 1) tails followed by a head, we observe that
$$P(e_k) = p_k = \frac{1}{2^k} \qquad k = 1, 2, 3, \ldots \tag{1.79}$$
(c) Using the power series summation formula, we have
$$P(S) = \sum_{k=1}^{\infty} P(e_k) = \sum_{k=1}^{\infty} \frac{1}{2^k} = \sum_{k=1}^{\infty} \left(\frac{1}{2}\right)^k = \frac{\frac{1}{2}}{1 - \frac{1}{2}} = 1 \tag{1.80}$$


1.36. Consider the experiment of Prob. 1.35.

(a) Find the probability that the first head appears on an even-numbered toss.
(b) Find the probability that the first head appears on an odd-numbered toss.

(a) Let A be the event "the first head appears on an even-numbered toss." Then, by Eq. (1.36) and using Eq. (1.79) of Prob. 1.35, we have
$$P(A) = p_2 + p_4 + p_6 + \cdots = \sum_{m=1}^{\infty} p_{2m} = \sum_{m=1}^{\infty} \frac{1}{2^{2m}} = \sum_{m=1}^{\infty} \left(\frac{1}{4}\right)^m = \frac{\frac{1}{4}}{1 - \frac{1}{4}} = \frac{1}{3}$$
(b) Let B be the event "the first head appears on an odd-numbered toss." Then it is obvious that $B = \bar{A}$. Then, by Eq. (1.25), we get
$$P(B) = P(\bar{A}) = 1 - P(A) = 1 - \tfrac{1}{3} = \tfrac{2}{3}$$
As a check, notice that
$$P(B) = p_1 + p_3 + p_5 + \cdots = \sum_{m=0}^{\infty} p_{2m+1} = \sum_{m=0}^{\infty} \frac{1}{2^{2m+1}} = \frac{1}{2} \sum_{m=0}^{\infty} \left(\frac{1}{4}\right)^m = \frac{1}{2} \left(\frac{1}{1 - \frac{1}{4}}\right) = \frac{2}{3}$$


CONDITIONAL PROBABILITY 
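1.37. Show that $P(A \mid B)$ defined by Eq. (1.39) satisfies the three axioms of a probability, that is,

(a) $P(A \mid B) \ge 0$
(b) $P(S \mid B) = 1$
(c) $P(A_1 \cup A_2 \mid B) = P(A_1 \mid B) + P(A_2 \mid B)$ if $A_1 \cap A_2 = \varnothing$

(a) From definition (1.39),
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \qquad P(B) > 0$$
By axiom 1, $P(A \cap B) \ge 0$. Thus,
$$P(A \mid B) \ge 0$$
(b) By Eq. (1.5), $S \cap B = B$. Then
$$P(S \mid B) = \frac{P(S \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1$$
(c) By definition (1.39),
$$P(A_1 \cup A_2 \mid B) = \frac{P[(A_1 \cup A_2) \cap B]}{P(B)}$$
Now by Eqs. (1.8) and (1.11), we have
$$(A_1 \cup A_2) \cap B = (A_1 \cap B) \cup (A_2 \cap B)$$
and $A_1 \cap A_2 = \varnothing$ implies that $(A_1 \cap B) \cap (A_2 \cap B) = \varnothing$. Thus, by axiom 3 we get
$$P(A_1 \cup A_2 \mid B) = \frac{P(A_1 \cap B) + P(A_2 \cap B)}{P(B)} = \frac{P(A_1 \cap B)}{P(B)} + \frac{P(A_2 \cap B)}{P(B)} = P(A_1 \mid B) + P(A_2 \mid B) \qquad \text{if } A_1 \cap A_2 = \varnothing$$

1.38. Find $P(A \mid B)$ if (a) $A \cap B = \varnothing$, (b) $A \subset B$, and (c) $B \subset A$.

(a) If $A \cap B = \varnothing$, then $P(A \cap B) = P(\varnothing) = 0$. Thus,
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(\varnothing)}{P(B)} = 0$$
(b) If $A \subset B$, then $A \cap B = A$ and
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)}{P(B)}$$
(c) If $B \subset A$, then $A \cap B = B$ and
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1$$

1.39. Show that if $P(A \mid B) > P(A)$, then $P(B \mid A) > P(B)$.

If $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)} > P(A)$, then $P(A \cap B) > P(A)P(B)$. Thus,
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} > \frac{P(A)P(B)}{P(A)} = P(B) \qquad \text{or} \qquad P(B \mid A) > P(B)$$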


1.40. Consider the experiment of throwing the two fair dice of Prob. 1.31 behind you; you are then 
informed that the sum is not greater than 3. 


(a) Find the probability of the event that two faces are the same without the information given. 
(b) Find the probability of the same event with the information given. 


(a) Let A be the event that two faces are the same. Then from Fig. 1-3 (Prob. 1.5) and by Eq. (1.38), we have
$$A = \{(i, j) : i = j\} = \{(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)\}$$
and
$$P(A) = \frac{n(A)}{n(S)} = \frac{6}{36} = \frac{1}{6}$$
(b) Let B be the event that the sum is not greater than 3. Again from Fig. 1-3, we see that
$$B = \{(i, j) : i + j \le 3\} = \{(1, 1), (1, 2), (2, 1)\}$$
and
$$P(B) = \frac{n(B)}{n(S)} = \frac{3}{36} = \frac{1}{12}$$
Now $A \cap B$ is the event that two faces are the same and also that their sum is not greater than 3. Thus,
$$P(A \cap B) = \frac{n(A \cap B)}{n(S)} = \frac{1}{36}$$
Then by definition (1.39), we obtain
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{\frac{1}{36}}{\frac{1}{12}} = \frac{1}{3}$$
Note that the probability of the event that two faces are the same doubled from $\frac{1}{6}$ to $\frac{1}{3}$ with the information given.

Alternative Solution:

There are 3 elements in B, and 1 of them belongs to A. Thus, the probability of the same event with the information given is $\frac{1}{3}$.


1.41. Two manufacturing plants produce similar parts. Plant 1 produces 1,000 parts, 100 of which are defective. Plant 2 produces 2,000 parts, 150 of which are defective. A part is selected at random and found to be defective. What is the probability that it came from plant 1?

Let B be the event that "the part selected is defective," and let A be the event that "the part selected came from plant 1." Then $A \cap B$ is the event that the item selected is defective and came from plant 1. Since a part is selected at random, we assume equally likely events, and using Eq. (1.38), we have
$$P(A \cap B) = \frac{100}{3000} = \frac{1}{30}$$
Similarly, since there are 3000 parts and 250 of them are defective, we have
$$P(B) = \frac{250}{3000} = \frac{1}{12}$$
By Eq. (1.39), the probability that the part came from plant 1 is
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{\frac{1}{30}}{\frac{1}{12}} = \frac{12}{30} = 0.4$$

Alternative Solution:

There are 250 defective parts, and 100 of these are from plant 1. Thus, the probability that the defective part came from plant 1 is $\frac{100}{250} = 0.4$.


1.42. A lot of 100 semiconductor chips contains 20 that are defective. Two chips are selected at random, without replacement, from the lot.

(a) What is the probability that the first one selected is defective?
(b) What is the probability that the second one selected is defective given that the first one was defective?
(c) What is the probability that both are defective?

(a) Let A denote the event that the first one selected is defective. Then, by Eq. (1.38),
$$P(A) = \tfrac{20}{100} = 0.2$$
(b) Let B denote the event that the second one selected is defective. After the first one selected is defective, there are 99 chips left in the lot with 19 chips that are defective. Thus, the probability that the second one selected is defective given that the first one was defective is
$$P(B \mid A) = \tfrac{19}{99} \approx 0.192$$
(c) By Eq. (1.41), the probability that both are defective is
$$P(A \cap B) = P(B \mid A)P(A) = \left(\tfrac{19}{99}\right)(0.2) \approx 0.0384$$


1.43. A number is selected at random from {1, 2, ..., 100}. Given that the number selected is divisible by 2, find the probability that it is divisible by 3 or 5.

Let
$A_2$ = event that the number is divisible by 2
$A_3$ = event that the number is divisible by 3
$A_5$ = event that the number is divisible by 5
Then the desired probability is
$$P(A_3 \cup A_5 \mid A_2) = \frac{P[(A_3 \cup A_5) \cap A_2]}{P(A_2)} \qquad \text{[Eq. (1.39)]}$$
$$= \frac{P[(A_3 \cap A_2) \cup (A_5 \cap A_2)]}{P(A_2)} \qquad \text{[Eq. (1.12)]}$$
$$= \frac{P(A_3 \cap A_2) + P(A_5 \cap A_2) - P(A_3 \cap A_5 \cap A_2)}{P(A_2)} \qquad \text{[Eq. (1.29)]}$$
Now
$A_3 \cap A_2$ = event that the number is divisible by 6
$A_5 \cap A_2$ = event that the number is divisible by 10
$A_3 \cap A_5 \cap A_2$ = event that the number is divisible by 30
and
$$P(A_3 \cap A_2) = \tfrac{16}{100} \qquad P(A_5 \cap A_2) = \tfrac{10}{100} \qquad P(A_3 \cap A_5 \cap A_2) = \tfrac{3}{100}$$
Thus,
$$P(A_3 \cup A_5 \mid A_2) = \frac{\frac{16}{100} + \frac{10}{100} - \frac{3}{100}}{\frac{50}{100}} = \frac{23}{50} = 0.46$$
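
The conditional probability just obtained can also be found by counting within the reduced sample space of even numbers. The Python sketch below does this directly; the names `even`, `hits`, and `p_cond` are illustrative and do not appear in the text.

```python
from fractions import Fraction

numbers = range(1, 101)

# Reduced sample space: the numbers divisible by 2
even = [k for k in numbers if k % 2 == 0]

# Within that space, the numbers divisible by 3 or by 5
hits = [k for k in even if k % 3 == 0 or k % 5 == 0]

p_cond = Fraction(len(hits), len(even))
print(p_cond)         # 23/50
print(float(p_cond))  # 0.46
```

Counting inside the conditioning event is the same idea as the alternative solutions given for the dice and manufacturing-plant problems.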


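1.44. Let $A_1, A_2, \ldots, A_n$ be events in a sample space S. Show that
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)P(A_2 \mid A_1)P(A_3 \mid A_1 \cap A_2) \cdots P(A_n \mid A_1 \cap A_2 \cap \cdots \cap A_{n-1}) \tag{1.81}$$

We prove Eq. (1.81) by induction. Suppose Eq. (1.81) is true for n = k:
$$P(A_1 \cap A_2 \cap \cdots \cap A_k) = P(A_1)P(A_2 \mid A_1)P(A_3 \mid A_1 \cap A_2) \cdots P(A_k \mid A_1 \cap A_2 \cap \cdots \cap A_{k-1})$$
Multiplying both sides by $P(A_{k+1} \mid A_1 \cap A_2 \cap \cdots \cap A_k)$, we have
$$P(A_1 \cap A_2 \cap \cdots \cap A_k)P(A_{k+1} \mid A_1 \cap A_2 \cap \cdots \cap A_k) = P(A_1 \cap A_2 \cap \cdots \cap A_{k+1})$$
and
$$P(A_1 \cap A_2 \cap \cdots \cap A_{k+1}) = P(A_1)P(A_2 \mid A_1)P(A_3 \mid A_1 \cap A_2) \cdots P(A_{k+1} \mid A_1 \cap A_2 \cap \cdots \cap A_k)$$
Thus, Eq. (1.81) is also true for n = k + 1. By Eq. (1.41), Eq. (1.81) is true for n = 2. Thus, Eq. (1.81) is true for $n \ge 2$.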


1.45. Two cards are drawn at random from a deck. Find the probability that both are aces.

Let A be the event that the first card is an ace, and B be the event that the second card is an ace. The desired probability is $P(B \cap A)$. Since a card is drawn at random, $P(A) = \frac{4}{52}$. Now if the first card is an ace, then there will be 3 aces left in the deck of 51 cards. Thus, $P(B \mid A) = \frac{3}{51}$. By Eq. (1.41),
$$P(B \cap A) = P(B \mid A)P(A) = \left(\tfrac{3}{51}\right)\left(\tfrac{4}{52}\right) = \tfrac{1}{221}$$

Check:

By counting technique, we have
$$P(B \cap A) = \frac{\binom{4}{2}}{\binom{52}{2}} = \frac{(4)(3)}{(52)(51)} = \frac{1}{221}$$


1.46. There are two identical decks of cards, each possessing a distinct symbol so that the cards from each deck can be identified. One deck of cards is laid out in a fixed order, and the other deck is shuffled and the cards laid out one by one on top of the fixed deck. Whenever two cards with the same symbol occur in the same position, we say that a match has occurred. Let the number of cards in the deck be 10. Find the probability of getting a match at the first four positions.

Let $A_i$, i = 1, 2, 3, 4, be the events that a match occurs at the ith position. The required probability is
$$P(A_1 \cap A_2 \cap A_3 \cap A_4)$$
By Eq. (1.81),
$$P(A_1 \cap A_2 \cap A_3 \cap A_4) = P(A_1)P(A_2 \mid A_1)P(A_3 \mid A_1 \cap A_2)P(A_4 \mid A_1 \cap A_2 \cap A_3)$$
There are 10 cards that can go into position 1, only one of which matches. Thus, $P(A_1) = \frac{1}{10}$. $P(A_2 \mid A_1)$ is the conditional probability of a match at position 2 given a match at position 1. Now there are 9 cards left to go into position 2, only one of which matches. Thus, $P(A_2 \mid A_1) = \frac{1}{9}$. In a similar fashion, we obtain $P(A_3 \mid A_1 \cap A_2) = \frac{1}{8}$ and $P(A_4 \mid A_1 \cap A_2 \cap A_3) = \frac{1}{7}$. Thus,
$$P(A_1 \cap A_2 \cap A_3 \cap A_4) = \left(\tfrac{1}{10}\right)\left(\tfrac{1}{9}\right)\left(\tfrac{1}{8}\right)\left(\tfrac{1}{7}\right) = \tfrac{1}{5040}$$


TOTAL PROBABILITY 
1.47. Verify Eq. (1.44).

Since $B \cap S = B$ [and using Eq. (1.43)], we have
$$B = B \cap S = B \cap (A_1 \cup A_2 \cup \cdots \cup A_n) = (B \cap A_1) \cup (B \cap A_2) \cup \cdots \cup (B \cap A_n) \tag{1.82}$$
Now the events $B \cap A_i$, i = 1, 2, ..., n, are mutually exclusive, as seen from the Venn diagram of Fig. 1-14. Then by axiom 3 of probability and Eq. (1.41), we obtain
$$P(B) = P(B \cap S) = \sum_{i=1}^{n} P(B \cap A_i) = \sum_{i=1}^{n} P(B \mid A_i)P(A_i)$$

Fig. 1-14

1.48. Show that for any events A and B in S,
$$P(B) = P(B \mid A)P(A) + P(B \mid \bar{A})P(\bar{A}) \tag{1.83}$$

From Eq. (1.64) (Prob. 1.23), we have
$$P(B) = P(B \cap A) + P(B \cap \bar{A})$$
Using Eq. (1.39), we obtain
$$P(B) = P(B \mid A)P(A) + P(B \mid \bar{A})P(\bar{A})$$
Note that Eq. (1.83) is the special case of Eq. (1.44).


1.49. Suppose that a laboratory test to detect a certain disease has the following statistics. Let 


A = event that the tested person has the disease
B = event that the test result is positive

It is known that
$$P(B \mid A) = 0.99 \qquad \text{and} \qquad P(B \mid \bar{A}) = 0.005$$
and 0.1 percent of the population actually has the disease. What is the probability that a person has the disease given that the test result is positive?

From the given statistics, we have
$$P(A) = 0.001 \qquad \text{then} \qquad P(\bar{A}) = 0.999$$
The desired probability is $P(A \mid B)$. Thus, using Eqs. (1.42) and (1.83), we obtain
$$P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid \bar{A})P(\bar{A})} = \frac{(0.99)(0.001)}{(0.99)(0.001) + (0.005)(0.999)} = 0.165$$


Note that in only 16.5 percent of the cases where the tests are positive will the person actually have the 
disease even though the test is 99 percent effective in detecting the disease when it is, in fact, present. 
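
The laboratory-test calculation above is a two-hypothesis application of Bayes' rule, and it is instructive to see how strongly the answer depends on the prior. The Python sketch below wraps the formula in a small function; the name `posterior` and its parameter names are hypothetical, chosen only for this illustration.

```python
def posterior(prior: float, p_pos_given_d: float, p_pos_given_not_d: float) -> float:
    """P(A|B) by Bayes' rule, with P(B) expanded by total probability."""
    numerator = p_pos_given_d * prior
    evidence = numerator + p_pos_given_not_d * (1.0 - prior)
    return numerator / evidence


# Values from the laboratory test: P(A) = 0.001, P(B|A) = 0.99, P(B|not A) = 0.005
p = posterior(0.001, 0.99, 0.005)
print(round(p, 3))  # 0.165

# The same test applied to a population where the disease is more common
print(round(posterior(0.05, 0.99, 0.005), 3))  # 0.912
```

The second call is not from the text; it only illustrates that the low posterior comes from the rarity of the disease rather than from a weakness of the test.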


1.50. A company producing electric relays has three manufacturing plants producing 50, 30, and 20 
percent, respectively, of its product. Suppose that the probabilities that a relay manufactured by 
these plants is defective are 0.02, 0.05, and 0.01, respectively. 


(a) If a relay is selected at random from the output of the company, what is the probability that it is defective?

(b) If a relay selected at random is found to be defective, what is the probability that it was manufactured by plant 2?

(a) Let B be the event that the relay is defective, and let $A_i$ be the event that the relay is manufactured by plant i (i = 1, 2, 3). The desired probability is $P(B)$. Using Eq. (1.44), we have
$$P(B) = \sum_{i=1}^{3} P(B \mid A_i)P(A_i) = (0.02)(0.5) + (0.05)(0.3) + (0.01)(0.2) = 0.027$$
(b) The desired probability is $P(A_2 \mid B)$. Using Eq. (1.42) and the result from part (a), we obtain
$$P(A_2 \mid B) = \frac{P(B \mid A_2)P(A_2)}{P(B)} = \frac{(0.05)(0.3)}{0.027} = 0.556$$

1.51. Two numbers are chosen at random from among the numbers 1 to 10 without replacement. Find the probability that the second number chosen is 5.

Let $A_i$, i = 1, 2, ..., 10, denote the event that the first number chosen is i. Let B be the event that the second number chosen is 5. Then by Eq. (1.44),
$$P(B) = \sum_{i=1}^{10} P(B \mid A_i)P(A_i)$$
Now $P(A_i) = \frac{1}{10}$. $P(B \mid A_i)$ is the probability that the second number chosen is 5, given that the first is i. If i = 5, then $P(B \mid A_i) = 0$. If $i \ne 5$, then $P(B \mid A_i) = \frac{1}{9}$. Hence,
$$P(B) = \sum_{i=1}^{10} P(B \mid A_i)P(A_i) = 9\left(\tfrac{1}{9}\right)\left(\tfrac{1}{10}\right) = \tfrac{1}{10}$$


1.52. Consider the binary communication channel shown in Fig. 1-15. The channel input symbol X may assume the state 0 or the state 1, and, similarly, the channel output symbol Y may assume either the state 0 or the state 1. Because of the channel noise, an input 0 may convert to an output 1 and vice versa. The channel is characterized by the channel transition probabilities $p_0$, $q_0$, $p_1$, and $q_1$, defined by
$$p_0 = P(y_1 \mid x_0) \qquad \text{and} \qquad p_1 = P(y_0 \mid x_1)$$
$$q_0 = P(y_0 \mid x_0) \qquad \text{and} \qquad q_1 = P(y_1 \mid x_1)$$
where $x_0$ and $x_1$ denote the events (X = 0) and (X = 1), respectively, and $y_0$ and $y_1$ denote the events (Y = 0) and (Y = 1), respectively. Note that $p_0 + q_0 = 1 = p_1 + q_1$. Let $P(x_0) = 0.5$, $p_0 = 0.1$, and $p_1 = 0.2$.

(a) Find $P(y_0)$ and $P(y_1)$.
(b) If a 0 was observed at the output, what is the probability that a 0 was the input state?
(c) If a 1 was observed at the output, what is the probability that a 1 was the input state?
(d) Calculate the probability of error $P_e$.

Fig. 1-15

(a) We note that
$$P(x_1) = 1 - P(x_0) = 1 - 0.5 = 0.5$$
$$P(y_0 \mid x_0) = q_0 = 1 - p_0 = 1 - 0.1 = 0.9$$
$$P(y_1 \mid x_1) = q_1 = 1 - p_1 = 1 - 0.2 = 0.8$$
Using Eq. (1.44), we obtain
$$P(y_0) = P(y_0 \mid x_0)P(x_0) + P(y_0 \mid x_1)P(x_1) = 0.9(0.5) + 0.2(0.5) = 0.55$$
$$P(y_1) = P(y_1 \mid x_0)P(x_0) + P(y_1 \mid x_1)P(x_1) = 0.1(0.5) + 0.8(0.5) = 0.45$$
(b) Using Bayes' rule (1.42), we have
$$P(x_0 \mid y_0) = \frac{P(x_0)P(y_0 \mid x_0)}{P(y_0)} = \frac{(0.5)(0.9)}{0.55} = 0.818$$
(c) Similarly,
$$P(x_1 \mid y_1) = \frac{P(x_1)P(y_1 \mid x_1)}{P(y_1)} = \frac{(0.5)(0.8)}{0.45} = 0.889$$
(d) The probability of error is
$$P_e = P(y_1 \mid x_0)P(x_0) + P(y_0 \mid x_1)P(x_1) = 0.1(0.5) + 0.2(0.5) = 0.15$$
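
The four parts of the binary channel problem above all follow from three inputs, so they can be collected in one small routine. The Python sketch below is one way to do this; the function name `channel_stats` and the dictionary keys are assumptions made for illustration, not notation from the text.

```python
def channel_stats(p_x0: float, p0: float, p1: float) -> dict:
    """Output, posterior, and error probabilities of a binary channel.

    p_x0 = P(x0); p0 = P(y1|x0) and p1 = P(y0|x1) are the crossover probabilities.
    """
    p_x1 = 1.0 - p_x0
    q0, q1 = 1.0 - p0, 1.0 - p1          # P(y0|x0) and P(y1|x1)
    p_y0 = q0 * p_x0 + p1 * p_x1         # total probability, Eq. (1.44)
    p_y1 = p0 * p_x0 + q1 * p_x1
    return {
        "P(y0)": p_y0,
        "P(y1)": p_y1,
        "P(x0|y0)": q0 * p_x0 / p_y0,    # Bayes' rule, Eq. (1.42)
        "P(x1|y1)": q1 * p_x1 / p_y1,
        "Pe": p0 * p_x0 + p1 * p_x1,     # an error occurs when the output differs from the input
    }


stats = channel_stats(p_x0=0.5, p0=0.1, p1=0.2)
for name, value in stats.items():
    print(name, round(value, 3))
```

With the values of the problem this reproduces 0.55, 0.45, 0.818, 0.889, and 0.15, and other input distributions or crossover probabilities can be tried by changing the arguments.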


INDEPENDENT EVENTS 
1.53. Let A and B be events in a sample space S. Show that if A and B are independent, then so are (a) A and $\bar{B}$, (b) $\bar{A}$ and B, and (c) $\bar{A}$ and $\bar{B}$.

(a) From Eq. (1.64) (Prob. 1.23), we have
$$P(A) = P(A \cap B) + P(A \cap \bar{B})$$
Since A and B are independent, using Eqs. (1.46) and (1.25), we obtain
$$P(A \cap \bar{B}) = P(A) - P(A \cap B) = P(A) - P(A)P(B) = P(A)[1 - P(B)] = P(A)P(\bar{B}) \tag{1.84}$$
Thus, by definition (1.46), A and $\bar{B}$ are independent.

(b) Interchanging A and B in Eq. (1.84), we obtain
$$P(B \cap \bar{A}) = P(B)P(\bar{A})$$
which indicates that $\bar{A}$ and B are independent.

(c) We have
$$P(\bar{A} \cap \bar{B}) = P[\overline{(A \cup B)}] \qquad \text{[Eq. (1.14)]}$$
$$= 1 - P(A \cup B) \qquad \text{[Eq. (1.25)]}$$
$$= 1 - P(A) - P(B) + P(A \cap B) \qquad \text{[Eq. (1.29)]}$$
$$= 1 - P(A) - P(B) + P(A)P(B) \qquad \text{[Eq. (1.46)]}$$
$$= 1 - P(A) - P(B)[1 - P(A)]$$
$$= [1 - P(A)][1 - P(B)]$$
$$= P(\bar{A})P(\bar{B}) \qquad \text{[Eq. (1.25)]}$$
Hence, $\bar{A}$ and $\bar{B}$ are independent.

1.54. Let A and B be events defined in a sample space S. Show that if both $P(A)$ and $P(B)$ are nonzero, then events A and B cannot be both mutually exclusive and independent.

Let A and B be mutually exclusive events and $P(A) \ne 0$, $P(B) \ne 0$. Then $P(A \cap B) = P(\varnothing) = 0$ but $P(A)P(B) \ne 0$. Since
$$P(A \cap B) \ne P(A)P(B)$$
A and B cannot be independent.

1.55. Show that if three events A, B, and C are independent, then A and $(B \cup C)$ are independent.

We have
$$P[A \cap (B \cup C)] = P[(A \cap B) \cup (A \cap C)] \qquad \text{[Eq. (1.12)]}$$
$$= P(A \cap B) + P(A \cap C) - P(A \cap B \cap C) \qquad \text{[Eq. (1.29)]}$$
$$= P(A)P(B) + P(A)P(C) - P(A)P(B)P(C) \qquad \text{[Eq. (1.50)]}$$
$$= P(A)P(B) + P(A)P(C) - P(A)P(B \cap C) \qquad \text{[Eq. (1.50)]}$$
$$= P(A)[P(B) + P(C) - P(B \cap C)]$$
$$= P(A)P(B \cup C) \qquad \text{[Eq. (1.29)]}$$
Thus, A and $(B \cup C)$ are independent.


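1.56. Consider the experiment of throwing two fair dice (Prob. 1.31). Let A be the event that the sum of the dice is 7, B be the event that the sum of the dice is 6, and C be the event that the first die is 4. Show that events A and C are independent, but events B and C are not independent.

From Fig. 1-3 (Prob. 1.5), we see that
$$A = \{\zeta_{16}, \zeta_{25}, \zeta_{34}, \zeta_{43}, \zeta_{52}, \zeta_{61}\}$$
$$B = \{\zeta_{15}, \zeta_{24}, \zeta_{33}, \zeta_{42}, \zeta_{51}\}$$
$$C = \{\zeta_{41}, \zeta_{42}, \zeta_{43}, \zeta_{44}, \zeta_{45}, \zeta_{46}\}$$
and
$$A \cap C = \{\zeta_{43}\} \qquad B \cap C = \{\zeta_{42}\}$$
Now
$$P(A) = \tfrac{6}{36} = \tfrac{1}{6} \qquad P(B) = \tfrac{5}{36} \qquad P(C) = \tfrac{6}{36} = \tfrac{1}{6}$$
and
$$P(A \cap C) = \tfrac{1}{36} = P(A)P(C)$$
Thus, events A and C are independent. But
$$P(B \cap C) = \tfrac{1}{36} \ne P(B)P(C)$$
Thus, events B and C are not independent.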


1.57. In the experiment of throwing two fair dice, let A be the event that the first die is odd, B be the event that the second die is odd, and C be the event that the sum is odd. Show that events A, B, and C are pairwise independent, but A, B, and C are not independent.

From Fig. 1-3 (Prob. 1.5), we see that
$$P(A) = P(B) = P(C) = \tfrac{18}{36} = \tfrac{1}{2}$$
$$P(A \cap B) = P(A \cap C) = P(B \cap C) = \tfrac{9}{36} = \tfrac{1}{4}$$
Thus
$$P(A \cap B) = \tfrac{1}{4} = P(A)P(B)$$
$$P(A \cap C) = \tfrac{1}{4} = P(A)P(C)$$
$$P(B \cap C) = \tfrac{1}{4} = P(B)P(C)$$
which indicates that A, B, and C are pairwise independent. However, since the sum of two odd numbers is even, $A \cap B \cap C = \varnothing$ and
$$P(A \cap B \cap C) = 0 \ne \tfrac{1}{8} = P(A)P(B)P(C)$$
which shows that A, B, and C are not independent.
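
The distinction between pairwise and mutual independence in the dice example above can be confirmed by enumerating all 36 outcomes. The Python sketch below uses exact fractions so that the equality tests are not affected by rounding; the helper `prob` and the names `pairwise`, `p_abc`, and `mutual` are hypothetical and introduced only for this check.

```python
from fractions import Fraction
from itertools import product

# The 36 equally likely outcomes (first die, second die)
outcomes = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """Probability of an event given as a predicate on an outcome (i, j)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def A(o): return o[0] % 2 == 1             # first die is odd
def B(o): return o[1] % 2 == 1             # second die is odd
def C(o): return (o[0] + o[1]) % 2 == 1    # sum is odd


# Pairwise independence: P(E and F) = P(E)P(F) for every pair
pairwise = all(
    prob(lambda o, E=E, F=F: E(o) and F(o)) == prob(E) * prob(F)
    for E, F in [(A, B), (A, C), (B, C)]
)

# Mutual independence additionally needs P(A and B and C) = P(A)P(B)P(C)
p_abc = prob(lambda o: A(o) and B(o) and C(o))
mutual = p_abc == prob(A) * prob(B) * prob(C)

print(pairwise, p_abc, mutual)  # True 0 False
```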


1.58. A system consisting of n separate components is said to be a series system if it functions when all n components function (Fig. 1-16). Assume that the components fail independently and that the probability of failure of component i is $p_i$, i = 1, 2, ..., n. Find the probability that the system functions.

Fig. 1-16 Series system.

Let $A_i$ be the event that component $s_i$ functions. Then
$$P(A_i) = 1 - P(\bar{A}_i) = 1 - p_i$$
Let A be the event that the system functions. Then, since the $A_i$'s are independent, we obtain
$$P(A) = P\left(\bigcap_{i=1}^{n} A_i\right) = \prod_{i=1}^{n} P(A_i) = \prod_{i=1}^{n} (1 - p_i) \tag{1.85}$$

1.59. A system consisting of n separate components is said to be a parallel system if it functions when at least one of the components functions (Fig. 1-17). Assume that the components fail independently and that the probability of failure of component i is $p_i$, i = 1, 2, ..., n. Find the probability that the system functions.

Fig. 1-17 Parallel system.

Let $A_i$ be the event that component $s_i$ functions. Then
$$P(\bar{A}_i) = p_i$$
Let A be the event that the system functions. Then, since the $A_i$'s are independent, we obtain
$$P(A) = 1 - P(\bar{A}) = 1 - P\left(\bigcap_{i=1}^{n} \bar{A}_i\right) = 1 - \prod_{i=1}^{n} p_i \tag{1.86}$$

1.60. Using Eqs. (1.85) and (1.86), redo Prob. 1.34.

From Prob. 1.34, $p_i = \frac{1}{2}$, i = 1, 2, 3, 4, where $p_i$ is the probability of failure of switch $s_i$. Let A be the event that there exists a closed path between a and b. Using Eq. (1.86), the probability of failure for the parallel combination of switches 3 and 4 is
$$p_{34} = p_3 p_4 = \left(\tfrac{1}{2}\right)\left(\tfrac{1}{2}\right) = \tfrac{1}{4}$$
Using Eq. (1.85), the probability of failure for the combination of switches 2, 3, and 4 is
$$p_{234} = 1 - \left(1 - \tfrac{1}{2}\right)\left(1 - \tfrac{1}{4}\right) = 1 - \tfrac{3}{8} = \tfrac{5}{8}$$
Again, using Eq. (1.86), we obtain
$$P(A) = 1 - p_1 p_{234} = 1 - \left(\tfrac{1}{2}\right)\left(\tfrac{5}{8}\right) = 1 - \tfrac{5}{16} = \tfrac{11}{16}$$
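
The series and parallel rules compose naturally, which is what makes the reduction of the switching network above so short. The Python sketch below expresses each rule as a function that returns the failure probability of a block, so that blocks can be nested. The function names `series_fail` and `parallel_fail` are illustrative, and independence of the component failures is assumed throughout, as in the text.

```python
from fractions import Fraction
from math import prod


def series_fail(fail_probs):
    """Failure probability of a series block: 1 - prod(1 - p_i), from Eq. (1.85)."""
    return 1 - prod(1 - p for p in fail_probs)


def parallel_fail(fail_probs):
    """Failure probability of a parallel block: prod(p_i), from Eq. (1.86)."""
    return prod(fail_probs)


half = Fraction(1, 2)                       # each switch fails with probability 1/2
p34 = parallel_fail([half, half])           # switches 3 and 4 in parallel
p234 = series_fail([half, p34])             # switch 2 in series with that block
p_works = 1 - parallel_fail([half, p234])   # switch 1 in parallel with the rest

print(p34, p234, p_works)  # 1/4 5/8 11/16
```

Returning failure probabilities from both helpers keeps the nesting uniform, since the output of one block is the input of the next.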


1.61. A Bernoulli experiment is a random experiment, the outcome of which can be classified in but one of two mutually exclusive and exhaustive ways, say success or failure. A sequence of Bernoulli trials occurs when a Bernoulli experiment is performed several independent times so that the probability of success, say p, remains the same from trial to trial. Now an infinite sequence of Bernoulli trials is performed. Find the probability that (a) at least 1 success occurs in the first n trials; (b) exactly k successes occur in the first n trials; (c) all trials result in successes.

(a) In order to find the probability of at least 1 success in the first n trials, it is easier to first compute the probability of the complementary event, that of no successes in the first n trials. Let $A_i$ denote the event of a failure on the ith trial. Then the probability of no successes is, by independence,
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)P(A_2) \cdots P(A_n) = (1 - p)^n \tag{1.87}$$
Hence, the probability that at least 1 success occurs in the first n trials is $1 - (1 - p)^n$.

(b) In any particular sequence of the first n outcomes, if k successes occur, where k = 0, 1, 2, ..., n, then n - k failures occur. There are $\binom{n}{k}$ such sequences, and each one of these has probability $p^k(1 - p)^{n-k}$. Thus, the probability that exactly k successes occur in the first n trials is given by $\binom{n}{k} p^k (1 - p)^{n-k}$.

(c) Since $\bar{A}_i$ denotes the event of a success on the ith trial, the probability that all trials resulted in successes in the first n trials is, by independence,
$$P(\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n) = P(\bar{A}_1)P(\bar{A}_2) \cdots P(\bar{A}_n) = p^n \tag{1.88}$$
Hence, using the continuity theorem of probability (1.74) (Prob. 1.28), the probability that all trials result in successes is given by
$$P\left(\bigcap_{i=1}^{\infty} \bar{A}_i\right) = P\left(\lim_{n \to \infty} \bigcap_{i=1}^{n} \bar{A}_i\right) = \lim_{n \to \infty} P\left(\bigcap_{i=1}^{n} \bar{A}_i\right) = \lim_{n \to \infty} p^n = \begin{cases} 0 & p < 1 \\ 1 & p = 1 \end{cases}$$


1.62. Let S be the sample space of an experiment and S = {A, B, C}, where $P(A) = p$, $P(B) = q$, and $P(C) = r$. The experiment is repeated infinitely, and it is assumed that the successive experiments are independent. Find the probability of the event that A occurs before B.

Suppose that A occurs for the first time at the nth trial of the experiment. If A is to have occurred before B, then C must have occurred on the first (n - 1) trials. Let D be the event that A occurs before B. Then
$$D = \bigcup_{n=1}^{\infty} D_n$$
where $D_n$ is the event that C occurs on the first (n - 1) trials and A occurs on the nth trial. Since the $D_n$'s are mutually exclusive, we have
$$P(D) = \sum_{n=1}^{\infty} P(D_n)$$
Since the trials are independent, we have
$$P(D_n) = [P(C)]^{n-1} P(A) = r^{n-1} p$$
Thus,
$$P(D) = \sum_{n=1}^{\infty} r^{n-1} p = p \sum_{k=0}^{\infty} r^k = \frac{p}{1 - r} = \frac{p}{p + q}$$
or
$$P(D) = \frac{P(A)}{P(A) + P(B)} \tag{1.89}$$
since p + q + r = 1.


1.63. In a gambling game, craps, a pair of dice is rolled and the outcome of the experiment is the sum of the dice. The player wins on the first roll if the sum is 7 or 11 and loses if the sum is 2, 3, or 12. If the sum is 4, 5, 6, 8, 9, or 10, that number is called the player's "point." Once the point is established, the rule is: If the player rolls a 7 before the point, the player loses; but if the point is rolled before a 7, the player wins. Compute the probability of winning in the game of craps.

Let A, B, and C be the events that the player wins, the player wins on the first roll, and the player wins after establishing a point, respectively. Then $P(A) = P(B) + P(C)$. Now from Fig. 1-3 (Prob. 1.5),
$$P(B) = P(\text{sum} = 7) + P(\text{sum} = 11) = \tfrac{6}{36} + \tfrac{2}{36} = \tfrac{2}{9}$$
Let $A_k$ be the event that point of k occurs before 7. Then
$$P(C) = \sum_{k \in \{4, 5, 6, 8, 9, 10\}} P(A_k)P(\text{point} = k)$$
By Eq. (1.89) (Prob. 1.62),
$$P(A_k) = \frac{P(\text{sum} = k)}{P(\text{sum} = k) + P(\text{sum} = 7)} \tag{1.90}$$
Again from Fig. 1-3,
$$P(\text{sum} = 4) = \tfrac{3}{36} \qquad P(\text{sum} = 5) = \tfrac{4}{36} \qquad P(\text{sum} = 6) = \tfrac{5}{36}$$
$$P(\text{sum} = 8) = \tfrac{5}{36} \qquad P(\text{sum} = 9) = \tfrac{4}{36} \qquad P(\text{sum} = 10) = \tfrac{3}{36}$$
Now by Eq. (1.90),
$$P(A_4) = \tfrac{1}{3} \qquad P(A_5) = \tfrac{2}{5} \qquad P(A_6) = \tfrac{5}{11}$$
$$P(A_8) = \tfrac{5}{11} \qquad P(A_9) = \tfrac{2}{5} \qquad P(A_{10}) = \tfrac{1}{3}$$
Using these values, we obtain
$$P(A) = P(B) + P(C) = \tfrac{2}{9} + \tfrac{134}{495} = \tfrac{244}{495} \approx 0.493$$
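
The craps calculation above combines the distribution of the sum of two dice with the "A before B" result of Eq. (1.89). The Python sketch below carries out the same steps with exact fractions. The names `sum_prob`, `p_first_roll`, `p_via_point`, and `p_win` are hypothetical and serve only this illustration.

```python
from fractions import Fraction
from itertools import product

# Distribution of the sum of two fair dice
sum_prob = {}
for i, j in product(range(1, 7), repeat=2):
    sum_prob[i + j] = sum_prob.get(i + j, 0) + Fraction(1, 36)

# Win on the first roll: the sum is 7 or 11
p_first_roll = sum_prob[7] + sum_prob[11]

# Win through a point k: roll k first, then k again before a 7, Eq. (1.90)
p_via_point = sum(
    sum_prob[k] * sum_prob[k] / (sum_prob[k] + sum_prob[7])
    for k in (4, 5, 6, 8, 9, 10)
)

p_win = p_first_roll + p_via_point
print(p_first_roll, p_via_point, p_win)  # 2/9 134/495 244/495
print(round(float(p_win), 4))            # 0.4929
```

The result is slightly below one half, so the game is mildly unfavorable to the player.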


Supplementary Problems 


1.64. Consider the experiment of selecting items from a group consisting of three items {a, b, c}.

(a) Find the sample space $S_1$ of the experiment in which two items are selected without replacement.
(b) Find the sample space $S_2$ of the experiment in which two items are selected with replacement.

Ans. (a) $S_1$ = {ab, ac, ba, bc, ca, cb}
(b) $S_2$ = {aa, ab, ac, ba, bb, bc, ca, cb, cc}

1.65. Let A and B be arbitrary events. Then show that $A \subset B$ if and only if $A \cup B = B$.

Hint: Draw a Venn diagram.

1.66. Let A and B be events in the sample space S. Show that if $A \subset B$, then $\bar{B} \subset \bar{A}$.

Hint: Draw a Venn diagram.

1.67. Verify Eq. (1.13).

Hint: Draw a Venn diagram.

1.68. Let A and B be any two events in S. The difference of B and A, denoted by B - A, is defined as
$$B - A = B \cap \bar{A}$$
The symmetric difference of A and B, denoted by $A \,\Delta\, B$, is defined by
$$A \,\Delta\, B = (A - B) \cup (B - A)$$
Show that
$$A \,\Delta\, B = (A \cup B) \cap \overline{(A \cap B)}$$

Hint: Draw a Venn diagram.

1.69. Let A and B be any two events in S. Express the following events in terms of A and B.

(a) At least one of the events occurs.
(b) Exactly one of two events occurs.

Ans. (a) $A \cup B$; (b) $A \,\Delta\, B$

1.70. Let A, B, and C be any three events in S. Express the following events in terms of these events.

(a) Either B or C occurs, but not A.
(b) Exactly one of the events occurs.
(c) Exactly two of the events occur.

Ans. (a) $\bar{A} \cap (B \cup C)$
(b) $\{A \cap \overline{(B \cup C)}\} \cup \{B \cap \overline{(A \cup C)}\} \cup \{C \cap \overline{(A \cup B)}\}$
(c) $\{(A \cap B) \cap \bar{C}\} \cup \{(A \cap C) \cap \bar{B}\} \cup \{(B \cap C) \cap \bar{A}\}$

1.71. A random experiment has sample space S = {a, b, c}. Suppose that P({a, c}) = 0.75 and P({b, c}) = 0.6. Find the probabilities of the elementary events.

Ans. P(a) = 0.4, P(b) = 0.25, P(c) = 0.35

1.72. Show that

(a) $P(\bar{A} \cup \bar{B}) = 1 - P(A \cap B)$
(b) $P(A \cap B) \ge 1 - P(\bar{A}) - P(\bar{B})$
(c) $P(A \,\Delta\, B) = P(A \cup B) - P(A \cap B)$

Hint: (a) Use Eqs. (1.15) and (1.25).
(b) Use Eqs. (1.29), (1.25), and (1.28).
(c) See Prob. 1.68 and use axiom 3.


1.73. Let A, B, and C be three events in S. If $P(A) = P(B) = \frac{1}{4}$, $P(C) = \frac{1}{3}$, $P(A \cap B) = \frac{1}{8}$, $P(A \cap C) = \frac{1}{6}$, and $P(B \cap C) = 0$, find $P(A \cup B \cup C)$.

Ans. $\frac{13}{24}$


Verify Eq. (1.30). 


Hint: Prove by induction. 


Show that
    $P(A_1 \cap A_2 \cap \cdots \cap A_n) \ge P(A_1) + P(A_2) + \cdots + P(A_n) - (n - 1)$
Hint: Use induction to generalize Bonferroni's inequality (1.63) (Prob. 1.22).

In an experiment consisting of 10 throws of a pair of fair dice, find the probability of the event that at least
one double 6 occurs.


Ans. 0.246


Show that if P(A) > P(B), then P(A |B) > P(B| A). 

Hint: Use Eqs. (1.39) and (1.40).

An urn contains 8 white balls and 4 red balls. The experiment consists of drawing 2 balls from the urn 
without replacement. Find the probability that both balls drawn are white. 


Ans. 0.424

1.79. There are 100 patients in a hospital with a certain disease. Of these, 10 are selected to undergo a drug
treatment that increases the cure rate from 50 percent to 75 percent. What is the probability
that a patient received the drug treatment if that patient is known to be cured?

Ans. 0.143

1.80. Two boys and two girls enter a music hall and take four seats at random in a row. What is the probability
that the girls take the two end seats?

Ans. 1/6

1.81. Let A and B be two independent events in S. It is known that $P(A \cap B) = 0.16$ and $P(A \cup B) = 0.64$. Find
P(A) and P(B).

Ans. P(A) = P(B) = 0.4

1.82. The relay network shown in Fig. 1-18 operates if and only if there is a closed path of relays from left to
right. Assume that relays fail independently and that the probability of failure of each relay is as shown.
What is the probability that the relay network operates?

Ans. 0.865


Fig. 1-18


Chapter 2 


Random Variables 


2.1 INTRODUCTION 


In this chapter, the concept of a random variable is introduced. The main purpose of using a 
random variable is so that we can define certain probability functions that make it both convenient 
and easy to compute the probabilities of various events. 


2.2 RANDOM VARIABLES 
A. Definitions: 


Consider a random experiment with sample space S. A random variable $X(\zeta)$ is a single-valued
real function that assigns a real number, called the value of $X(\zeta)$, to each sample point $\zeta$ of S. Often, we
use a single letter X for this function in place of $X(\zeta)$ and use r.v. to denote the random variable.

Note that the terminology used here is traditional. Clearly a random variable is not a variable at
all in the usual sense; it is a function.

The sample space S is termed the domain of the r.v. X, and the collection of all numbers [values
of $X(\zeta)$] is termed the range of the r.v. X. Thus the range of X is a certain subset of the set of all real
numbers (Fig. 2-1).

Note that two or more different sample points might give the same value of $X(\zeta)$, but two different
numbers in the range cannot be assigned to the same sample point.


Fig. 2-1 Random variable X as a function.


EXAMPLE 2.1 In the experiment of tossing a coin once (Example 1.1), we might define the r.v. X as (Fig. 2-2)
    X(H) = 1    X(T) = 0
Note that we could also define another r.v., say Y or Z, with


    Y(H) = 0, Y(T) = 1    or    Z(H) = 0, Z(T) =


B. Events Defined by Random Variables: 
If X is a r.v. and x is a fixed real number, we can define the event (X = x) as
    $(X = x) = \{\zeta : X(\zeta) = x\}$    (2.1)
Similarly, for fixed numbers $x$, $x_1$, and $x_2$, we can define the following events:


    $(X \le x) = \{\zeta : X(\zeta) \le x\}$
    $(X > x) = \{\zeta : X(\zeta) > x\}$    (2.2)
    $(x_1 < X \le x_2) = \{\zeta : x_1 < X(\zeta) \le x_2\}$


Fig. 2-2 One random variable associated with coin tossing.


These events have probabilities that are denoted by
    $P(X = x) = P\{\zeta : X(\zeta) = x\}$
    $P(X \le x) = P\{\zeta : X(\zeta) \le x\}$
    $P(X > x) = P\{\zeta : X(\zeta) > x\}$    (2.3)
    $P(x_1 < X \le x_2) = P\{\zeta : x_1 < X(\zeta) \le x_2\}$


EXAMPLE 2.2 In the experiment of tossing a fair coin three times (Prob. 1.1), the sample space $S_1$ consists of
eight equally likely sample points $S_1 = \{HHH, \ldots, TTT\}$. If X is the r.v. giving the number of heads obtained, find
(a) $P(X = 2)$; (b) $P(X < 2)$.


(a) Let $A \subset S_1$ be the event defined by X = 2. Then, from Prob. 1.1, we have
    $A = (X = 2) = \{\zeta : X(\zeta) = 2\} = \{HHT, HTH, THH\}$
Since the sample points are equally likely, we have
    $P(X = 2) = P(A) = \frac{3}{8}$
(b) Let $B \subset S_1$ be the event defined by X < 2. Then
    $B = (X < 2) = \{\zeta : X(\zeta) < 2\} = \{HTT, THT, TTH, TTT\}$
and $P(X < 2) = P(B) = \frac{4}{8} = \frac{1}{2}$
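
The counting argument of Example 2.2 can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original text; the helper names `X` and `prob` are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import product

# Sample space of three tosses: 8 equally likely points such as ('H', 'H', 'T').
sample_space = list(product("HT", repeat=3))

def X(outcome):
    """The r.v. X: number of heads in one sample point."""
    return outcome.count("H")

def prob(event):
    """P(event), where `event` is a predicate on the value of X.

    Valid only because all sample points are equally likely.
    """
    favorable = [s for s in sample_space if event(X(s))]
    return Fraction(len(favorable), len(sample_space))

p_eq_2 = prob(lambda x: x == 2)  # P(X = 2)
p_lt_2 = prob(lambda x: x < 2)   # P(X < 2)
print(p_eq_2, p_lt_2)
```

Using exact fractions keeps the result directly comparable with the hand computation.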


2.3. DISTRIBUTION FUNCTIONS 
A. Definition: 
The distribution function [or cumulative distribution function (cdf)] of X is the function defined by
    $F_X(x) = P(X \le x) \qquad -\infty < x < \infty$    (2.4)


Most of the information about a random experiment described by the r.v. X is determined by the
behavior of $F_X(x)$.


B. Properties of F(x): 


Several properties of $F_X(x)$ follow directly from its definition (2.4).


1. $0 \le F_X(x) \le 1$    (2.5)
2. $F_X(x_1) \le F_X(x_2)$ if $x_1 < x_2$    (2.6)
3. $\lim_{x \to \infty} F_X(x) = F_X(\infty) = 1$    (2.7)
4. $\lim_{x \to -\infty} F_X(x) = F_X(-\infty) = 0$    (2.8)
5. $\lim_{x \to a^+} F_X(x) = F_X(a^+) = F_X(a)$, where $a^+ = \lim_{0 < \varepsilon \to 0} (a + \varepsilon)$    (2.9)


Property 1 follows because $F_X(x)$ is a probability. Property 2 shows that $F_X(x)$ is a nondecreasing
function (Prob. 2.5). Properties 3 and 4 follow from Eqs. (1.22) and (1.26):
    $\lim_{x \to \infty} P(X \le x) = P(X \le \infty) = P(S) = 1$
    $\lim_{x \to -\infty} P(X \le x) = P(X \le -\infty) = P(\varnothing) = 0$


Property 5 indicates that $F_X(x)$ is continuous on the right. This is a consequence of the definition
(2.4).


Table 2.1
 x     $(X \le x)$                                  $F_X(x)$
-1     $\varnothing$                                0
 0     {TTT}                                        1/8
 1     {TTT, TTH, THT, HTT}                         4/8 = 1/2
 2     {TTT, TTH, THT, HTT, HHT, HTH, THH}          7/8
 3     S                                            1
 4     S                                            1


EXAMPLE 2.3 Consider the r.v. X defined in Example 2.2. Find and sketch the cdf $F_X(x)$ of X.

Table 2.1 gives $F_X(x) = P(X \le x)$ for x = -1, 0, 1, 2, 3, 4. Since the value of X must be an integer, the value of
$F_X(x)$ for noninteger values of x must be the same as the value of $F_X(x)$ for the nearest smaller integer value of x.
The cdf $F_X(x)$ is sketched in Fig. 2-3. Note that $F_X(x)$ has jumps at x = 0, 1, 2, 3, and that at each jump the upper value
is the correct value for $F_X(x)$.


Fig. 2-3


C. Determination of Probabilities from the Distribution Function: 


From definition (2.4), we can compute other probabilities, such as $P(a < X \le b)$, $P(X > a)$, and
$P(X < b)$ (Prob. 2.6):


    $P(a < X \le b) = F_X(b) - F_X(a)$    (2.10)
    $P(X > a) = 1 - F_X(a)$    (2.11)
    $P(X < b) = F_X(b^-)$, where $b^- = \lim_{0 < \varepsilon \to 0} (b - \varepsilon)$    (2.12)


2.4 DISCRETE RANDOM VARIABLES AND PROBABILITY MASS FUNCTIONS
A. Definition:


Let X be a r.v. with cdf $F_X(x)$. If $F_X(x)$ changes values only in jumps (at most a countable number
of them) and is constant between jumps, that is, $F_X(x)$ is a staircase function (see Fig. 2-3), then X
is called a discrete random variable. Alternatively, X is a discrete r.v. only if its range contains a finite
or countably infinite number of points. The r.v. X in Example 2.3 is an example of a discrete r.v.


B. Probability Mass Functions: 


Suppose that the jumps in $F_X(x)$ of a discrete r.v. X occur at the points $x_1, x_2, \ldots$, where the
sequence may be either finite or countably infinite, and we assume $x_i < x_j$ if $i < j$.
Then    $F_X(x_i) - F_X(x_{i-1}) = P(X \le x_i) - P(X \le x_{i-1}) = P(X = x_i)$    (2.13)
Let     $p_X(x) = P(X = x)$    (2.14)


The function $p_X(x)$ is called the probability mass function (pmf) of the discrete r.v. X.


Properties of $p_X(x)$:
1. $0 \le p_X(x_k) \le 1 \qquad k = 1, 2, \ldots$    (2.15)
2. $p_X(x) = 0$ if $x \ne x_k$ $(k = 1, 2, \ldots)$    (2.16)
3. $\sum_k p_X(x_k) = 1$    (2.17)


The cdf $F_X(x)$ of a discrete r.v. X can be obtained by
    $F_X(x) = P(X \le x) = \sum_{x_k \le x} p_X(x_k)$    (2.18)
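
As an illustration of Eq. (2.18), the sketch below builds the cdf of the number-of-heads r.v. of Example 2.2 from its pmf and then applies Eq. (2.10). The dictionary `pmf` and the function names are assumptions made for this example, not notation from the text.

```python
from fractions import Fraction

# pmf of X = number of heads in three fair tosses (Example 2.2).
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def cdf(x):
    """F_X(x): sum of p_X(x_k) over all mass points x_k <= x, as in Eq. (2.18)."""
    return sum((p for xk, p in pmf.items() if xk <= x), Fraction(0))

def prob_interval(a, b):
    """P(a < X <= b) = F_X(b) - F_X(a), as in Eq. (2.10)."""
    return cdf(b) - cdf(a)

# The cdf is a staircase: it is constant between the mass points.
for x in (-1, 0, 1, 1.5, 2, 3, 4):
    print(x, cdf(x))
```

Evaluating `cdf` at a noninteger such as 1.5 returns the value at the nearest smaller mass point, which is the staircase behavior described for Fig. 2-3.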


2.5 CONTINUOUS RANDOM VARIABLES AND PROBABILITY DENSITY FUNCTIONS
A. Definition:


Let X be a r.v. with cdf $F_X(x)$. If $F_X(x)$ is continuous and also has a derivative $dF_X(x)/dx$ which
exists everywhere except at possibly a finite number of points and is piecewise continuous, then X is
called a continuous random variable. Alternatively, X is a continuous r.v. only if its range contains an
interval (either finite or infinite) of real numbers. Thus, if X is a continuous r.v., then (Prob. 2.18)


    $P(X = x) = 0$    (2.19)
Note that this is an example of an event with probability 0 that is not necessarily the impossible event $\varnothing$.

In most applications, the r.v. is either discrete or continuous. But if the cdf $F_X(x)$ of a r.v. X
possesses features of both discrete and continuous r.v.’s, then the r.v. X is called a mixed r.v. (Prob.
2.10).


B. Probability Density Functions: 


Let    $f_X(x) = \frac{dF_X(x)}{dx}$    (2.20)


The function $f_X(x)$ is called the probability density function (pdf) of the continuous r.v. X.

Properties of $f_X(x)$:

1. $f_X(x) \ge 0$    (2.21)

2. $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$    (2.22)

3. $f_X(x)$ is piecewise continuous.

4. $P(a < X \le b) = \int_a^b f_X(x)\,dx$    (2.23)

The cdf $F_X(x)$ of a continuous r.v. X can be obtained by

    $F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(\xi)\,d\xi$    (2.24)

By Eq. (2.19), if X is a continuous r.v., then

    $P(a < X \le b) = P(a \le X \le b) = P(a \le X < b) = P(a < X < b) = \int_a^b f_X(x)\,dx = F_X(b) - F_X(a)$    (2.25)


2.6 MEAN AND VARIANCE
A. Mean:


The mean (or expected value) of a r.v. X, denoted by $\mu_X$ or E(X), is defined by

    $\mu_X = E(X) = \begin{cases} \sum_k x_k\, p_X(x_k) & X\text{: discrete} \\ \int_{-\infty}^{\infty} x f_X(x)\,dx & X\text{: continuous} \end{cases}$    (2.26)


B. Moment:


The nth moment of a r.v. X is defined by

    $E(X^n) = \begin{cases} \sum_k x_k^n\, p_X(x_k) & X\text{: discrete} \\ \int_{-\infty}^{\infty} x^n f_X(x)\,dx & X\text{: continuous} \end{cases}$    (2.27)

Note that the mean of X is the first moment of X.


C. Variance:
The variance of a r.v. X, denoted by $\sigma_X^2$ or Var(X), is defined by
    $\sigma_X^2 = \mathrm{Var}(X) = E\{[X - E(X)]^2\}$    (2.28)
Thus,
    $\sigma_X^2 = \begin{cases} \sum_k (x_k - \mu_X)^2\, p_X(x_k) & X\text{: discrete} \\ \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx & X\text{: continuous} \end{cases}$    (2.29)

Note from definition (2.28) that
    $\mathrm{Var}(X) \ge 0$    (2.30)


The standard deviation of a r.v. X, denoted by $\sigma_X$, is the positive square root of Var(X).
Expanding the right-hand side of Eq. (2.28), we can obtain the following relation:


    $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$    (2.31)


which is a useful formula for determining the variance.
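
The expansion mentioned above can be written out as follows. This is a sketch that assumes linearity of expectation, which has not been stated explicitly at this point in the text, and it treats $\mu_X = E(X)$ as a constant.

```latex
\begin{align*}
\operatorname{Var}(X) &= E\{[X - \mu_X]^2\} \\
  &= E\left(X^2 - 2\mu_X X + \mu_X^2\right) \\
  &= E(X^2) - 2\mu_X E(X) + \mu_X^2 \\
  &= E(X^2) - [E(X)]^2
\end{align*}
```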


2.7 SOME SPECIAL DISTRIBUTIONS


In this section we present some important special distributions. 


A. Bernoulli Distribution: 
A r.v. X is called a Bernoulli r.v. with parameter p if its pmf is given by
    $p_X(k) = P(X = k) = p^k (1 - p)^{1-k} \qquad k = 0, 1$    (2.32)
where $0 \le p \le 1$. By Eq. (2.18), the cdf $F_X(x)$ of the Bernoulli r.v. X is given by

    $F_X(x) = \begin{cases} 0 & x < 0 \\ 1 - p & 0 \le x < 1 \\ 1 & x \ge 1 \end{cases}$    (2.33)

Figure 2-4 illustrates a Bernoulli distribution.


Fig. 2-4 Bernoulli distribution.


The mean and variance of the Bernoulli r.v. X are

    $\mu_X = E(X) = p$    (2.34)
    $\sigma_X^2 = \mathrm{Var}(X) = p(1 - p)$    (2.35)

A Bernoulli r.v. X is associated with some experiment where an outcome can be classified as
either a “success” or a “failure,” and the probability of a success is p and the probability of a failure is
1 - p. Such experiments are often called Bernoulli trials (Prob. 1.61).

B. Binomial Distribution:


A r.v. X is called a binomial r.v. with parameters (n, p) if its pmf is given by

    $p_X(k) = P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k} \qquad k = 0, 1, \ldots, n$    (2.36)

where $0 \le p \le 1$ and

    $\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$

which is known as the binomial coefficient. The corresponding cdf of X is

    $F_X(x) = \sum_{k=0}^{m} \binom{n}{k} p^k (1 - p)^{n-k} \qquad m \le x < m + 1$    (2.37)

where m is an integer with $0 \le m \le n$; $F_X(x) = 0$ for $x < 0$ and $F_X(x) = 1$ for $x \ge n$.

Figure 2-5 illustrates the binomial distribution for n = 6 and p = 0.6.


Fig. 2-5 Binomial distribution with n = 6, p = 0.6.


The mean and variance of the binomial r.v. X are (Prob. 2.28)

    $\mu_X = E(X) = np$    (2.38)
    $\sigma_X^2 = \mathrm{Var}(X) = np(1 - p)$    (2.39)

A binomial r.v. X is associated with some experiments in which n independent Bernoulli trials are
performed and X represents the number of successes that occur in the n trials. Note that a Bernoulli
r.v. is just a binomial r.v. with parameters (1, p).


C. Poisson Distribution: 


A r.v. X is called a Poisson r.v. with parameter $\lambda$ (> 0) if its pmf is given by

    $p_X(k) = P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!} \qquad k = 0, 1, \ldots$    (2.40)

The corresponding cdf of X is

    $F_X(x) = e^{-\lambda} \sum_{k=0}^{n} \frac{\lambda^k}{k!} \qquad n \le x < n + 1$    (2.41)

Figure 2-6 illustrates the Poisson distribution for $\lambda$ = 3.


Fig. 2-6 Poisson distribution with $\lambda$ = 3.


The mean and variance of the Poisson r.v. X are (Prob. 2.29)

    $\mu_X = E(X) = \lambda$    (2.42)
    $\sigma_X^2 = \mathrm{Var}(X) = \lambda$    (2.43)

The Poisson r.v. has a tremendous range of applications in diverse areas because it may be used
as an approximation for a binomial r.v. with parameters (n, p) when n is large and p is small enough
so that np is of a moderate size (Prob. 2.40).
Some examples of Poisson r.v.’s include

1. The number of telephone calls arriving at a switching center during various intervals of time

2. The number of misprints on a page of a book

3. The number of customers entering a bank during various intervals of time
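
The quality of the Poisson approximation to the binomial can be explored numerically. The sketch below compares the two pmfs for one illustrative parameter choice; the values n = 1000 and p = 0.003 are assumptions picked for this example and do not come from the text.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Binomial pmf, Eq. (2.36)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson pmf, Eq. (2.40)."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.003   # assumed values: n large, p small
lam = n * p          # matching Poisson parameter, lambda = np = 3

# Largest pointwise gap between the two pmfs over k = 0, ..., 20.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print(max_gap)
```

Repeating the comparison with a larger p (for instance p = 0.3 with n = 10) shows the gap growing, which matches the stated requirement that p be small.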


D. Uniform Distribution: 
A r.v. X is called a uniform r.v. over (a, b) if its pdf is given by

    $f_X(x) = \begin{cases} \frac{1}{b - a} & a < x < b \\ 0 & \text{otherwise} \end{cases}$    (2.44)

The corresponding cdf of X is

    $F_X(x) = \begin{cases} 0 & x \le a \\ \frac{x - a}{b - a} & a < x < b \\ 1 & x \ge b \end{cases}$    (2.45)

Figure 2-7 illustrates a uniform distribution.
The mean and variance of the uniform r.v. X are (Prob. 2.31)

    $\mu_X = E(X) = \frac{a + b}{2}$    (2.46)
    $\sigma_X^2 = \mathrm{Var}(X) = \frac{(b - a)^2}{12}$    (2.47)


Fig. 2-7 Uniform distribution over (a, b).


A uniform r.v. X is often used where we have no prior knowledge of the actual pdf and all
continuous values in some range seem equally likely (Prob. 2.69).


E. Exponential Distribution: 
A r.v. X is called an exponential r.v. with parameter $\lambda$ (> 0) if its pdf is given by

    $f_X(x) = \begin{cases} \lambda e^{-\lambda x} & x > 0 \\ 0 & x < 0 \end{cases}$    (2.48)

which is sketched in Fig. 2-8(a). The corresponding cdf of X is

    $F_X(x) = \begin{cases} 1 - e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$    (2.49)

which is sketched in Fig. 2-8(b).


Fig. 2-8 Exponential distribution.
The mean and variance of the exponential r.v. X are (Prob. 2.32)

    $\mu_X = E(X) = \frac{1}{\lambda}$    (2.50)
    $\sigma_X^2 = \mathrm{Var}(X) = \frac{1}{\lambda^2}$    (2.51)


The most interesting property of the exponential distribution is its “memoryless” property. By 
this we mean that if the lifetime of an item is exponentially distributed, then an item which has been 
in use for some hours is as good as a new item with regard to the amount of time remaining until the 
item fails. The exponential distribution is the only distribution which possesses this property (Prob. 
2.53). 
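
The memoryless property can be stated as P(X > s + t | X > s) = P(X > t). The sketch below checks this numerically from the cdf in Eq. (2.49); the parameter values are arbitrary choices for illustration, not values from the text.

```python
from math import exp, isclose

def survival(x, lam):
    """P(X > x) = 1 - F_X(x) = exp(-lam * x) for x >= 0, from Eq. (2.49)."""
    return exp(-lam * x)

lam, s, t = 0.5, 2.0, 3.0   # assumed illustrative values

# P(X > s + t | X > s) = P(X > s + t) / P(X > s), since (X > s + t) is a subset of (X > s).
conditional = survival(s + t, lam) / survival(s, lam)
unconditional = survival(t, lam)

print(conditional, unconditional, isclose(conditional, unconditional))
```

The two numbers agree for any nonnegative s and t, because the exponentials factor: $e^{-\lambda(s+t)} / e^{-\lambda s} = e^{-\lambda t}$.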
F. Normal (or Gaussian) Distribution:


A r.v. X is called a normal (or gaussian) r.v. if its pdf is given by

    $f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x - \mu)^2/(2\sigma^2)}$    (2.52)

The corresponding cdf of X is

    $F_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{x} e^{-(\xi - \mu)^2/(2\sigma^2)}\,d\xi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(x - \mu)/\sigma} e^{-\xi^2/2}\,d\xi$    (2.53)

This integral cannot be evaluated in a closed form and must be evaluated numerically. It is convenient
to use the function $\Phi(z)$, defined as

    $\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\xi^2/2}\,d\xi$    (2.54)

to help us to evaluate the value of $F_X(x)$. Then Eq. (2.53) can be written as
    $F_X(x) = \Phi\!\left(\frac{x - \mu}{\sigma}\right)$    (2.55)
Note that
    $\Phi(-z) = 1 - \Phi(z)$    (2.56)


The function $\Phi(z)$ is tabulated in Table A (Appendix A). Figure 2-9 illustrates a normal distribution.


Fig. 2-9 Normal distribution.


The mean and variance of the normal r.v. X are (Prob. 2.33)

    $\mu_X = E(X) = \mu$    (2.57)
    $\sigma_X^2 = \mathrm{Var}(X) = \sigma^2$    (2.58)


We shall use the notation $N(\mu; \sigma^2)$ to denote that X is normal with mean $\mu$ and variance $\sigma^2$. A
normal r.v. Z with zero mean and unit variance, that is, Z = N(0; 1), is called a standard normal r.v.
Note that the cdf of the standard normal r.v. is given by Eq. (2.54). The normal r.v. is probably the
most important type of continuous r.v. It has played a significant role in the study of random phenomena
in nature. Many naturally occurring random phenomena are approximately normal. Another
reason for the importance of the normal r.v. is a remarkable theorem called the central limit theorem.
This theorem states that the sum of a large number of independent r.v.’s, under certain conditions,
can be approximated by a normal r.v. (see Sec. 4.8C).
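
When Table A is not at hand, $\Phi(z)$ can be evaluated through the error function, using the standard identity $\Phi(z) = \frac{1}{2}[1 + \mathrm{erf}(z/\sqrt{2})]$. The sketch below is an illustration only; the function names `Phi` and `normal_cdf` are chosen here and are not notation from the text.

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cdf of Eq. (2.54), computed via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """cdf of N(mu; sigma^2) through Eq. (2.55): F_X(x) = Phi((x - mu) / sigma)."""
    return Phi((x - mu) / sigma)

z = 1.3  # arbitrary test point
print(Phi(0.0))                    # value at the mean of the standard normal
print(Phi(-z), 1.0 - Phi(z))       # symmetry relation, Eq. (2.56)
print(normal_cdf(5.0, 3.0, 2.0))   # same as Phi(1.0)
```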
2.8 CONDITIONAL DISTRIBUTIONS
In Sec. 1.6 the conditional probability of an event A given event B is defined as

    $P(A \mid B) = \frac{P(A \cap B)}{P(B)} \qquad P(B) > 0$

The conditional cdf $F_X(x \mid B)$ of a r.v. X given event B is defined by

    $F_X(x \mid B) = P(X \le x \mid B) = \frac{P\{(X \le x) \cap B\}}{P(B)}$    (2.59)

The conditional cdf $F_X(x \mid B)$ has the same properties as $F_X(x)$. (See Prob. 1.37 and Sec. 2.3.) In
particular,

    $F_X(-\infty \mid B) = 0 \qquad F_X(\infty \mid B) = 1$    (2.60)

    $P(a < X \le b \mid B) = F_X(b \mid B) - F_X(a \mid B)$    (2.61)
If X is a discrete r.v., then the conditional pmf $p_X(x_k \mid B)$ is defined by

    $p_X(x_k \mid B) = P(X = x_k \mid B) = \frac{P\{(X = x_k) \cap B\}}{P(B)}$    (2.62)
If X is a continuous r.v., then the conditional pdf $f_X(x \mid B)$ is defined by

    $f_X(x \mid B) = \frac{dF_X(x \mid B)}{dx}$    (2.63)
Solved Problems

2.1. Consider the experiment of throwing a fair die. Let X be the r.v. which assigns 1 if the number
that appears is even and 0 if the number that appears is odd.


(a) What is the range of X?
(b) Find P(X = 1) and P(X = 0).
The sample space S on which X is defined consists of 6 points which are equally likely:
    S = {1, 2, 3, 4, 5, 6}
(a) The range of X is $R_X = \{0, 1\}$.
(b) (X = 1) = {2, 4, 6}. Thus, $P(X = 1) = \frac{3}{6} = \frac{1}{2}$. Similarly, (X = 0) = {1, 3, 5}, and $P(X = 0) = \frac{1}{2}$.


2.2. Consider the experiment of tossing a coin three times (Prob. 1.1). Let X be the r.v. giving the
number of heads obtained. We assume that the tosses are independent and the probability of a
head is p.


(a) What is the range of X?
(b) Find the probabilities P(X = 0), P(X = 1), P(X = 2), and P(X = 3).
The sample space S on which X is defined consists of eight sample points (Prob. 1.1):
    S = {HHH, HHT, ..., TTT}
(a) The range of X is $R_X = \{0, 1, 2, 3\}$.
(b) If P(H) = p, then P(T) = 1 - p. Since the tosses are independent, we have

    $P(X = 0) = P[\{TTT\}] = (1 - p)^3$
    $P(X = 1) = P[\{HTT\}] + P[\{THT\}] + P[\{TTH\}] = 3(1 - p)^2 p$
    $P(X = 2) = P[\{HHT\}] + P[\{HTH\}] + P[\{THH\}] = 3(1 - p)p^2$
    $P(X = 3) = P[\{HHH\}] = p^3$


2.3. An information source generates symbols at random from a four-letter alphabet {a, b, c, d} with
probabilities $P(a) = \frac{1}{2}$, $P(b) = \frac{1}{4}$, and $P(c) = P(d) = \frac{1}{8}$. A coding scheme encodes these symbols
into binary codes as follows:

    a    0
    b    10
    c    110
    d    111

Let X be the r.v. denoting the length of the code, that is, the number of binary symbols (bits).

(a) What is the range of X?
(b) Assuming that the generations of symbols are independent, find the probabilities P(X = 1),
P(X = 2), P(X = 3), and P(X > 3).

(a) The range of X is $R_X = \{1, 2, 3\}$.
(b) $P(X = 1) = P[\{a\}] = P(a) = \frac{1}{2}$
    $P(X = 2) = P[\{b\}] = P(b) = \frac{1}{4}$
    $P(X = 3) = P[\{c, d\}] = P(c) + P(d) = \frac{1}{4}$
    $P(X > 3) = P(\varnothing) = 0$


2.4. Consider the experiment of throwing a dart onto a circular plate with unit radius. Let X be the
r.v. representing the distance of the point where the dart lands from the origin of the plate.
Assume that the dart always lands on the plate and that the dart is equally likely to land
anywhere on the plate.

(a) What is the range of X?
(b) Find (i) P(X < a) and (ii) P(a < X < b), where $a < b \le 1$.

(a) The range of X is $R_X = \{x : 0 \le x \le 1\}$.
(b) (i) (X < a) denotes that the point is inside the circle of radius a. Since the dart is equally likely to fall
anywhere on the plate, we have (Fig. 2-10)

    $P(X < a) = \frac{\pi a^2}{\pi 1^2} = a^2$

(ii) (a < X < b) denotes the event that the point is inside the annular ring with inner radius a and
outer radius b. Thus, from Fig. 2-10, we have

    $P(a < X < b) = \frac{\pi (b^2 - a^2)}{\pi 1^2} = b^2 - a^2$
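
The area argument of Prob. 2.4 can be cross-checked by simulation. The following is a rough Monte Carlo sketch, not a method from the text; the radii a = 0.5 and b = 0.8, the sample size, and the seed are arbitrary assumptions made for the illustration.

```python
import math
import random

def throw_dart(rng):
    """Distance from the origin of a point uniform on the unit disk.

    Uses rejection sampling: draw from the enclosing square and keep
    only the points that fall on the plate.
    """
    while True:
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        r = math.hypot(x, y)
        if r <= 1.0:
            return r

def estimate(a, b, n=100_000, seed=0):
    """Relative frequencies approximating P(X < a) and P(a < X < b)."""
    rng = random.Random(seed)
    hits_inner = hits_ring = 0
    for _ in range(n):
        r = throw_dart(rng)
        hits_inner += r < a
        hits_ring += a < r < b
    return hits_inner / n, hits_ring / n

a, b = 0.5, 0.8                    # assumed radii with a < b <= 1
p_inner, p_ring = estimate(a, b)
print(p_inner, a**2)               # estimate versus a^2
print(p_ring, b**2 - a**2)         # estimate versus b^2 - a^2
```

The estimates fluctuate from seed to seed, so they should only be expected to match the closed-form values to within sampling error.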
2.5. Verify Eq. (2.6).
Let $x_1 < x_2$. Then $(X \le x_1)$ is a subset of $(X \le x_2)$; that is, $(X \le x_1) \subset (X \le x_2)$. Then, by Eq. (1.27),
we have

    $P(X \le x_1) \le P(X \le x_2)$    or    $F_X(x_1) \le F_X(x_2)$

Fig. 2-10


2.6. Verify (a) Eq. (2.10); (b) Eq. (2.11); (c) Eq. (2.12).
(a) Since $(X \le b) = (X \le a) \cup (a < X \le b)$ and $(X \le a) \cap (a < X \le b) = \varnothing$, we have
    $P(X \le b) = P(X \le a) + P(a < X \le b)$
or    $F_X(b) = F_X(a) + P(a < X \le b)$
Thus,    $P(a < X \le b) = F_X(b) - F_X(a)$
(b) Since $(X \le a) \cup (X > a) = S$ and $(X \le a) \cap (X > a) = \varnothing$, we have
    $P(X \le a) + P(X > a) = P(S) = 1$
Thus,    $P(X > a) = 1 - P(X \le a) = 1 - F_X(a)$
(c) Now    $P(X < b) = P\left[\lim_{\varepsilon \to 0,\ \varepsilon > 0} X \le b - \varepsilon\right] = \lim_{\varepsilon \to 0,\ \varepsilon > 0} P(X \le b - \varepsilon) = \lim_{\varepsilon \to 0,\ \varepsilon > 0} F_X(b - \varepsilon) = F_X(b^-)$


2.7. Show that
(a) $P(a \le X \le b) = P(X = a) + F_X(b) - F_X(a)$    (2.64)
(b) $P(a < X < b) = F_X(b) - F_X(a) - P(X = b)$    (2.65)
(c) $P(a \le X < b) = P(X = a) + F_X(b) - F_X(a) - P(X = b)$    (2.66)
(a) Using Eqs. (1.23) and (2.10), we have
    $P(a \le X \le b) = P[(X = a) \cup (a < X \le b)]$
    $= P(X = a) + P(a < X \le b)$
    $= P(X = a) + F_X(b) - F_X(a)$
(b) We have
    $P(a < X \le b) = P[(a < X < b) \cup (X = b)]$
    $= P(a < X < b) + P(X = b)$
Again using Eq. (2.10), we obtain
    $P(a < X < b) = P(a < X \le b) - P(X = b)$
    $= F_X(b) - F_X(a) - P(X = b)$
(c) Similarly,    $P(a \le X \le b) = P[(a \le X < b) \cup (X = b)]$
    $= P(a \le X < b) + P(X = b)$
Using Eq. (2.64), we obtain
    $P(a \le X < b) = P(a \le X \le b) - P(X = b)$
    $= P(X = a) + F_X(b) - F_X(a) - P(X = b)$


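2.8. Let X be the r.v. defined in Prob. 2.3.
(a) Sketch the cdf $F_X(x)$ of X and specify the type of X.
(b) Find (i) $P(X \le 1)$, (ii) $P(1 < X \le 2)$, (iii) $P(X > 1)$, and (iv) $P(1 \le X \le 2)$.
(a) From the result of Prob. 2.3 and Eq. (2.18), we have

    $F_X(x) = P(X \le x) = \begin{cases} 0 & x < 1 \\ \frac{1}{2} & 1 \le x < 2 \\ \frac{3}{4} & 2 \le x < 3 \\ 1 & x \ge 3 \end{cases}$

which is sketched in Fig. 2-11. The r.v. X is a discrete r.v.
(b) (i) We see that
    $P(X \le 1) = F_X(1) = \frac{1}{2}$
(ii) By Eq. (2.10),
    $P(1 < X \le 2) = F_X(2) - F_X(1) = \frac{3}{4} - \frac{1}{2} = \frac{1}{4}$
(iii) By Eq. (2.11),
    $P(X > 1) = 1 - F_X(1) = 1 - \frac{1}{2} = \frac{1}{2}$
(iv) By Eq. (2.64),
    $P(1 \le X \le 2) = P(X = 1) + F_X(2) - F_X(1) = \frac{1}{2} + \frac{3}{4} - \frac{1}{2} = \frac{3}{4}$


Fig. 2-11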


2.9. Sketch the cdf $F_X(x)$ of the r.v. X defined in Prob. 2.4 and specify the type of X.


From the result of Prob. 2.4, we have

    $F_X(x) = P(X \le x) = \begin{cases} 0 & x < 0 \\ x^2 & 0 \le x < 1 \\ 1 & 1 \le x \end{cases}$

which is sketched in Fig. 2-12. The r.v. X is a continuous r.v.


Fig. 2-12
2.10. Consider the function given by

    $F(x) = \begin{cases} 0 & x < 0 \\ x + \frac{1}{2} & 0 \le x < \frac{1}{2} \\ 1 & x \ge \frac{1}{2} \end{cases}$

(a) Sketch F(x) and show that F(x) has the properties of a cdf discussed in Sec. 2.3B.
(b) If X is the r.v. whose cdf is given by F(x), find (i) $P(X \le \frac{1}{4})$, (ii) $P(0 < X \le \frac{1}{4})$, (iii) $P(X = 0)$,
and (iv) $P(0 \le X \le \frac{1}{4})$.
(c) Specify the type of X.

(a) The function F(x) is sketched in Fig. 2-13. From Fig. 2-13, we see that $0 \le F(x) \le 1$ and F(x) is a
nondecreasing function, $F(-\infty) = 0$, $F(\infty) = 1$, $F(0) = \frac{1}{2}$, and F(x) is continuous on the right. Thus,
F(x) satisfies all the properties [Eqs. (2.5) to (2.9)] required of a cdf.
(b) (i) We have
    $P(X \le \frac{1}{4}) = F(\frac{1}{4}) = \frac{1}{4} + \frac{1}{2} = \frac{3}{4}$
(ii) By Eq. (2.10),
    $P(0 < X \le \frac{1}{4}) = F(\frac{1}{4}) - F(0) = \frac{3}{4} - \frac{1}{2} = \frac{1}{4}$
(iii) By Eq. (2.12),
    $P(X = 0) = P(X \le 0) - P(X < 0) = F(0) - F(0^-) = \frac{1}{2} - 0 = \frac{1}{2}$
(iv) By Eq. (2.64),
    $P(0 \le X \le \frac{1}{4}) = P(X = 0) + F(\frac{1}{4}) - F(0) = \frac{1}{2} + \frac{3}{4} - \frac{1}{2} = \frac{3}{4}$
(c) The r.v. X is a mixed r.v.


2.11. Find the values of constants a and b such that


is a valid cdf. 


To satisfy property 1 of $F_X(x)$ [$0 \le F_X(x) \le 1$], we must have $0 \le a \le 1$ and b > 0. Since b > 0,
property 3 of $F_X(x)$ [$F_X(\infty) = 1$] is satisfied. It is seen that property 4 of $F_X(x)$ [$F_X(-\infty) = 0$] is also satisfied.
For $0 \le a \le 1$ and b > 0, F(x) is sketched in Fig. 2-14. From Fig. 2-14, we see that F(x) is a nondecreasing
function and continuous on the right, and properties 2 and 5 of $F_X(x)$ are satisfied. Hence, we conclude that
the given F(x) is a valid cdf if $0 \le a \le 1$ and b > 0. Note that if a = 0, then the r.v. X is a discrete r.v.; if a = 1,
then X is a continuous r.v.; and if 0 < a < 1, then X is a mixed r.v.


DISCRETE RANDOM VARIABLES AND PMF’S
2.12. Suppose a discrete r.v. X has the following pmfs: 
px(1) = 3 Px(2) = 4 Px(3) = & pxA4) = 5 
(a) Find and sketch the cdf $F_X(x)$ of the r.v. X.
(b) Find (i) $P(X < 1)$, (ii) $P(1 < X \le 3)$, (iii) $P(1 \le X \le 3)$.
(a) By Eq. (2.18), we obtain 


0 x<l 

4 }<x<2 
Fx) = P(X <x) = 73 2<x<3 

zt 3ex<4 

l x>4 


which is sketched in Fig, 2-15. 
(b) (i) By Eq, (2.12), we see that 
P(X < l= Fy} =0 
(ii) By Eq. (2.10), 
PU<X <3)=F,Q)—Fy)=3-3=38 
(ili) By Eq. (2.64), 


PU SX <3)=P(X =1)+ Fx8)- Fy =3+9-4=3 


Fig. 2-15


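2.13. (a) Verify that the function p(x) defined by

    $p(x) = \begin{cases} \frac{3}{4}\left(\frac{1}{4}\right)^x & x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases}$

is a pmf of a discrete r.v. X.
(b) Find (i) $P(X = 2)$, (ii) $P(X \le 2)$, (iii) $P(X \ge 1)$.


(a) It is clear that $0 \le p(x) \le 1$ and

    $\sum_{i=0}^{\infty} p(i) = \frac{3}{4} \sum_{i=0}^{\infty} \left(\frac{1}{4}\right)^i = \frac{3}{4} \cdot \frac{1}{1 - \frac{1}{4}} = 1$

Thus, p(x) satisfies all properties of the pmf [Eqs. (2.15) to (2.17)] of a discrete r.v. X.
(b) (i) By definition (2.14),
    $P(X = 2) = p(2) = \frac{3}{4}\left(\frac{1}{4}\right)^2 = \frac{3}{64}$
(ii) By Eq. (2.18),
    $P(X \le 2) = \sum_{i=0}^{2} p(i) = \frac{3}{4}\left(1 + \frac{1}{4} + \frac{1}{16}\right) = \frac{63}{64}$
(iii) By Eq. (1.25),
    $P(X \ge 1) = 1 - P(X = 0) = 1 - p(0) = 1 - \frac{3}{4} = \frac{1}{4}$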


2.14. Consider the experiment of tossing an honest coin repeatedly (Prob. 1.35). Let the r.v. X denote
the number of tosses required until the first head appears.


(a) Find and sketch the pmf $p_X(x)$ and the cdf $F_X(x)$ of X.
(b) Find (i) $P(1 < X \le 4)$, (ii) $P(X > 4)$.
(a) From the result of Prob. 1.35, the pmf of X is given by
    $p_X(x) = p_X(k) = P(X = k) = \left(\frac{1}{2}\right)^k \qquad k = 1, 2, \ldots$
Then by Eq. (2.18),

    $F_X(x) = P(X \le x) = \sum_{k=1}^{n} p_X(k) = \sum_{k=1}^{n} \left(\frac{1}{2}\right)^k \qquad n \le x < n + 1$

or

    $F_X(x) = \begin{cases} 0 & x < 1 \\ \frac{1}{2} & 1 \le x < 2 \\ \frac{3}{4} & 2 \le x < 3 \\ \ \vdots & \\ 1 - \left(\frac{1}{2}\right)^n & n \le x < n + 1 \\ \ \vdots & \end{cases}$

These functions are sketched in Fig. 2-16.
(b) (i) By Eq. (2.10),
    $P(1 < X \le 4) = F_X(4) - F_X(1) = \frac{15}{16} - \frac{1}{2} = \frac{7}{16}$
(ii) By Eq. (1.25),
    $P(X > 4) = 1 - P(X \le 4) = 1 - F_X(4) = 1 - \frac{15}{16} = \frac{1}{16}$

2.15. Consider a sequence of Bernoulli trials with probability p of success. This sequence is observed
until the first success occurs. Let the r.v. X denote the trial number on which this first success
occurs. Then the pmf of X is given by


    $p_X(x) = P(X = x) = (1 - p)^{x-1} p \qquad x = 1, 2, \ldots$    (2.67)


because there must be x - 1 failures before the first success occurs on trial x. The r.v. X defined
by Eq. (2.67) is called a geometric r.v. with parameter p.


(a) Show that $p_X(x)$ given by Eq. (2.67) satisfies Eq. (2.17).
(b) Find the cdf $F_X(x)$ of X.


(a) Recall that for a geometric series, the sum is given by

    $\sum_{n=0}^{\infty} a r^n = \sum_{n=1}^{\infty} a r^{n-1} = \frac{a}{1 - r} \qquad |r| < 1$    (2.68)

Thus,
    $\sum_{x} p_X(x) = \sum_{x=1}^{\infty} (1 - p)^{x-1} p = \frac{p}{1 - (1 - p)} = \frac{p}{p} = 1$
(b) Using Eq. (2.68), we obtain
    $P(X > k) = \sum_{x=k+1}^{\infty} (1 - p)^{x-1} p = \frac{(1 - p)^k p}{1 - (1 - p)} = (1 - p)^k$    (2.69)
Thus,    $P(X \le k) = 1 - P(X > k) = 1 - (1 - p)^k$    (2.70)
and    $F_X(x) = P(X \le x) = 1 - (1 - p)^x \qquad x = 1, 2, \ldots$    (2.71)


Note that the r.v. X of Prob. 2.14 is the geometric r.v. with $p = \frac{1}{2}$.
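
The closed form in Eq. (2.71) can be compared against a direct partial sum of the pmf in Eq. (2.67). The sketch below does this for p = 1/2, the case of Prob. 2.14; the function names and the range of k are choices made for this illustration.

```python
def geometric_pmf(x, p):
    """Geometric pmf, Eq. (2.67): x - 1 failures followed by one success."""
    return (1 - p) ** (x - 1) * p

def geometric_cdf(x, p):
    """Closed-form cdf at integer x >= 1, Eq. (2.71)."""
    return 1 - (1 - p) ** x

p = 0.5  # the honest coin of Prob. 2.14

# Largest discrepancy between the partial sums of the pmf and the closed form.
max_err = max(
    abs(sum(geometric_pmf(x, p) for x in range(1, k + 1)) - geometric_cdf(k, p))
    for k in range(1, 21)
)
print(max_err)
print(geometric_cdf(4, p) - geometric_cdf(1, p))  # P(1 < X <= 4)
```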


2.16. Let X be a binomial r.v. with parameters (n, p).
(a) Show that $p_X(x)$ given by Eq. (2.36) satisfies Eq. (2.17).
(b) Find P(X > 1) if n = 6 and p = 0.1.


(a) Recall that the binomial expansion formula is given by

    $(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$    (2.72)

Thus, by Eq. (2.36),

    $\sum_{k=0}^{n} p_X(k) = \sum_{k=0}^{n} \binom{n}{k} p^k (1 - p)^{n-k} = (p + 1 - p)^n = 1^n = 1$

(b) Now    $P(X > 1) = 1 - P(X = 0) - P(X = 1)$
    $= 1 - \binom{6}{0}(0.1)^0 (0.9)^6 - \binom{6}{1}(0.1)^1 (0.9)^5$
    $= 1 - (0.9)^6 - 6(0.1)(0.9)^5 \approx 0.114$


2.17. Let X be a Poisson r.v. with parameter $\lambda$.


(a) Show that $p_X(x)$ given by Eq. (2.40) satisfies Eq. (2.17).
(b) Find P(X > 2) with $\lambda$ = 4.


(a) By Eq. (2.40),

    $\sum_{k=0}^{\infty} p_X(k) = e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} = e^{-\lambda} e^{\lambda} = 1$

(b) With $\lambda$ = 4, we have
    $p_X(k) = e^{-4} \frac{4^k}{k!}$
and    $P(X \le 2) = \sum_{k=0}^{2} p_X(k) = e^{-4}(1 + 4 + 8) \approx 0.238$
Thus,    $P(X > 2) = 1 - P(X \le 2) \approx 1 - 0.238 = 0.762$

CONTINUOUS RANDOM VARIABLES AND PDF’S 
2.18. Verify Eq. (2.19).
From Eqs. (1.27) and (2.10), we have
    $P(X = x) \le P(x - \varepsilon < X \le x) = F_X(x) - F_X(x - \varepsilon)$


for any $\varepsilon > 0$. As $F_X(x)$ is continuous, the right-hand side of the above expression approaches 0 as $\varepsilon \to 0$.
Thus, P(X = x) = 0.


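2.19. The pdf of a continuous r.v. X is given by

    $f_X(x) = \begin{cases} \frac{1}{3} & 0 < x < 1 \\ \frac{2}{3} & 1 < x < 2 \\ 0 & \text{otherwise} \end{cases}$

Find the corresponding cdf $F_X(x)$ and sketch $f_X(x)$ and $F_X(x)$.

By Eq. (2.24), the cdf of X is given by

    $F_X(x) = \begin{cases} 0 & x < 0 \\ \int_0^x \frac{1}{3}\,d\xi = \frac{x}{3} & 0 \le x < 1 \\ \int_0^1 \frac{1}{3}\,d\xi + \int_1^x \frac{2}{3}\,d\xi = \frac{2}{3}x - \frac{1}{3} & 1 \le x < 2 \\ \int_0^1 \frac{1}{3}\,d\xi + \int_1^2 \frac{2}{3}\,d\xi = 1 & 2 \le x \end{cases}$

The functions $f_X(x)$ and $F_X(x)$ are sketched in Fig. 2-17.


Fig. 2-17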
2.20. Let X be a continuous r.v. with pdf

    $f_X(x) = \begin{cases} kx & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$

where k is a constant.

(a) Determine the value of k and sketch $f_X(x)$.
(b) Find and sketch the corresponding cdf $F_X(x)$.
(c) Find $P(\frac{1}{4} < X \le 2)$.

(a) By Eq. (2.21), we must have k > 0, and by Eq. (2.22),

    $\int_0^1 kx\,dx = \frac{k}{2} = 1$

Thus, k = 2 and

    $f_X(x) = \begin{cases} 2x & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$

which is sketched in Fig. 2-18(a).
(b) By Eq. (2.24), the cdf of X is given by

    $F_X(x) = \begin{cases} 0 & x < 0 \\ \int_0^x 2\xi\,d\xi = x^2 & 0 \le x < 1 \\ \int_0^1 2\xi\,d\xi = 1 & 1 \le x \end{cases}$

which is sketched in Fig. 2-18(b).

(c) By Eq. (2.25),
    $P(\frac{1}{4} < X \le 2) = F_X(2) - F_X(\frac{1}{4}) = 1 - \left(\frac{1}{4}\right)^2 = \frac{15}{16}$

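2.21. Show that the pdf of a normal r.v. X given by Eq. (2.52) satisfies Eq. (2.22).
From Eq. (2.52),

    $\int_{-\infty}^{\infty} f_X(x)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} e^{-(x - \mu)^2/(2\sigma^2)}\,dx$

Let $y = (x - \mu)/(\sqrt{2}\,\sigma)$. Then $dx = \sqrt{2}\,\sigma\,dy$ and

    $\frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} e^{-(x - \mu)^2/(2\sigma^2)}\,dx = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-y^2}\,dy$

Let    $\int_{-\infty}^{\infty} e^{-y^2}\,dy = I$

Then    $I^2 = \left[\int_{-\infty}^{\infty} e^{-x^2}\,dx\right]\left[\int_{-\infty}^{\infty} e^{-y^2}\,dy\right] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2 + y^2)}\,dx\,dy$

Letting $x = r\cos\theta$ and $y = r\sin\theta$ (that is, using polar coordinates), we have

    $I^2 = \int_0^{2\pi}\int_0^{\infty} e^{-r^2} r\,dr\,d\theta = 2\pi \int_0^{\infty} e^{-r^2} r\,dr = \pi$

Thus,    $I = \int_{-\infty}^{\infty} e^{-y^2}\,dy = \sqrt{\pi}$    (2.73)
and    $\int_{-\infty}^{\infty} f_X(x)\,dx = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-y^2}\,dy = \frac{1}{\sqrt{\pi}}\,\sqrt{\pi} = 1$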


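2.22. Consider a function

    $f(x) = \frac{1}{\sqrt{\pi}}\, e^{(-x^2 + x - a)} \qquad -\infty < x < \infty$

Find the value of a such that f(x) is a pdf of a continuous r.v. X.

Note that

    $f(x) = \frac{1}{\sqrt{\pi}}\, e^{-(x^2 - x + a)} = \frac{1}{\sqrt{\pi}}\, e^{-(x^2 - x + 1/4 + a - 1/4)} = \left[\frac{1}{\sqrt{\pi}}\, e^{-(x - 1/2)^2}\right] e^{-(a - 1/4)}$

If f(x) is a pdf of a continuous r.v. X, then by Eq. (2.22), we must have

    $\int_{-\infty}^{\infty} f(x)\,dx = e^{-(a - 1/4)} \int_{-\infty}^{\infty} \frac{1}{\sqrt{\pi}}\, e^{-(x - 1/2)^2}\,dx = 1$

Now by Eq. (2.52), the pdf of $N(\frac{1}{2}; \frac{1}{2})$ is $\frac{1}{\sqrt{\pi}}\, e^{-(x - 1/2)^2}$. Thus,

    $\int_{-\infty}^{\infty} \frac{1}{\sqrt{\pi}}\, e^{-(x - 1/2)^2}\,dx = 1$    and    $\int_{-\infty}^{\infty} f(x)\,dx = e^{-(a - 1/4)} = 1$

from which we obtain $a = \frac{1}{4}$.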


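2.23. A r.v. X is called a Rayleigh r.v. if its pdf is given by

    $f_X(x) = \begin{cases} \frac{x}{\sigma^2}\, e^{-x^2/(2\sigma^2)} & x > 0 \\ 0 & x < 0 \end{cases}$    (2.74)

(a) Determine the corresponding cdf $F_X(x)$.
(b) Sketch $f_X(x)$ and $F_X(x)$ for $\sigma$ = 1.


(a) By Eq. (2.24), the cdf of X is

    $F_X(x) = \int_0^x \frac{\xi}{\sigma^2}\, e^{-\xi^2/(2\sigma^2)}\,d\xi \qquad x \ge 0$

Let $y = \xi^2/(2\sigma^2)$. Then $dy = (1/\sigma^2)\,\xi\,d\xi$, and

    $F_X(x) = \int_0^{x^2/(2\sigma^2)} e^{-y}\,dy = 1 - e^{-x^2/(2\sigma^2)}$    (2.75)

(b) With $\sigma$ = 1, we have

    $f_X(x) = \begin{cases} x e^{-x^2/2} & x > 0 \\ 0 & x < 0 \end{cases}$

and    $F_X(x) = \begin{cases} 1 - e^{-x^2/2} & x \ge 0 \\ 0 & x < 0 \end{cases}$

These functions are sketched in Fig. 2-19.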


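Fig. 2-19 Rayleigh distribution with $\sigma$ = 1.


2.24. A r.v. X is called a gamma r.v. with parameter $(\alpha, \lambda)$ ($\alpha > 0$ and $\lambda > 0$) if its pdf is given by

    $f_X(x) = \begin{cases} \frac{\lambda e^{-\lambda x} (\lambda x)^{\alpha - 1}}{\Gamma(\alpha)} & x > 0 \\ 0 & x < 0 \end{cases}$    (2.76)

where $\Gamma(\alpha)$ is the gamma function defined by

    $\Gamma(\alpha) = \int_0^{\infty} e^{-x} x^{\alpha - 1}\,dx \qquad \alpha > 0$    (2.77)

(a) Show that the gamma function has the following properties:
    1. $\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha) \qquad \alpha > 0$    (2.78)
    2. $\Gamma(k + 1) = k! \qquad k\ (\ge 0)$: integer    (2.79)
    3. $\Gamma(\frac{1}{2}) = \sqrt{\pi}$    (2.80)

(b) Show that the pdf given by Eq. (2.76) satisfies Eq. (2.22).

(c) Plot $f_X(x)$ for $(\alpha, \lambda)$ = (1, 1), (2, 1), and (5, 2).

(a) Integrating Eq. (2.77) by parts ($u = x^{\alpha - 1}$, $dv = e^{-x}\,dx$), we obtain

    $\Gamma(\alpha) = -e^{-x} x^{\alpha - 1}\Big|_0^{\infty} + \int_0^{\infty} e^{-x} (\alpha - 1) x^{\alpha - 2}\,dx = (\alpha - 1) \int_0^{\infty} e^{-x} x^{\alpha - 2}\,dx = (\alpha - 1)\,\Gamma(\alpha - 1)$    (2.81)

Replacing $\alpha$ by $\alpha + 1$ in Eq. (2.81), we get Eq. (2.78).
Next, by applying Eq. (2.78) repeatedly using an integral value of $\alpha$, say $\alpha = k$, we obtain

    $\Gamma(k + 1) = k\,\Gamma(k) = k(k - 1)\,\Gamma(k - 1) = k(k - 1) \cdots (2)\,\Gamma(1)$
Since    $\Gamma(1) = \int_0^{\infty} e^{-x}\,dx = 1$
it follows that $\Gamma(k + 1) = k!$. Finally, by Eq. (2.77),
    $\Gamma(\frac{1}{2}) = \int_0^{\infty} e^{-x} x^{-1/2}\,dx$
Let $y = x^{1/2}$. Then $dy = \frac{1}{2} x^{-1/2}\,dx$, and
    $\Gamma(\frac{1}{2}) = 2 \int_0^{\infty} e^{-y^2}\,dy = \int_{-\infty}^{\infty} e^{-y^2}\,dy = \sqrt{\pi}$

in view of Eq. (2.73).
(b) Now

    $\int_{-\infty}^{\infty} f_X(x)\,dx = \int_0^{\infty} \frac{\lambda e^{-\lambda x} (\lambda x)^{\alpha - 1}}{\Gamma(\alpha)}\,dx = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \int_0^{\infty} e^{-\lambda x} x^{\alpha - 1}\,dx$

Let $y = \lambda x$. Then $dy = \lambda\,dx$ and

    $\int_{-\infty}^{\infty} f_X(x)\,dx = \frac{\lambda^{\alpha}}{\Gamma(\alpha)\,\lambda^{\alpha}} \int_0^{\infty} e^{-y} y^{\alpha - 1}\,dy = \frac{\lambda^{\alpha}}{\Gamma(\alpha)\,\lambda^{\alpha}}\,\Gamma(\alpha) = 1$

(c) The pdf’s $f_X(x)$ with $(\alpha, \lambda)$ = (1, 1), (2, 1), and (5, 2) are plotted in Fig. 2-20.


Fig. 2-20 Gamma distributions for selected values of $\alpha$ and $\lambda$.


Note that when $\alpha$ = 1, the gamma r.v. becomes an exponential r.v. with parameter $\lambda$ [Eq. (2.48)].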
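
The three gamma-function properties in Eqs. (2.78) to (2.80) can be spot-checked numerically with the standard library's `math.gamma`. This is only a sanity check at a few points, not a proof; the test value of `alpha` is an arbitrary assumption.

```python
from math import factorial, gamma, isclose, pi, sqrt

alpha = 3.7  # arbitrary positive value chosen for illustration

# Eq. (2.78): Gamma(alpha + 1) = alpha * Gamma(alpha)
recurrence_ok = isclose(gamma(alpha + 1), alpha * gamma(alpha))

# Eq. (2.79): Gamma(k + 1) = k! for nonnegative integers k
factorial_ok = all(isclose(gamma(k + 1), factorial(k)) for k in range(10))

# Eq. (2.80): Gamma(1/2) = sqrt(pi)
half_ok = isclose(gamma(0.5), sqrt(pi))

print(recurrence_ok, factorial_ok, half_ok)
```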
MEAN AND VARIANCE 


2.25. Consider a discrete r.v. X whose pmf is given by

    $p_X(x) = \begin{cases} \frac{1}{3} & x = -1, 0, 1 \\ 0 & \text{otherwise} \end{cases}$

(a) Plot $p_X(x)$ and find the mean and variance of X.
(b) Repeat (a) if the pmf is given by

    $p_X(x) = \begin{cases} \frac{1}{3} & x = -2, 0, 2 \\ 0 & \text{otherwise} \end{cases}$

(a) The pmf $p_X(x)$ is plotted in Fig. 2-21(a). By Eq. (2.26), the mean of X is

    $\mu_X = E(X) = \frac{1}{3}(-1 + 0 + 1) = 0$
By Eq. (2.29), the variance of X is

    $\sigma_X^2 = \mathrm{Var}(X) = E[(X - \mu_X)^2] = E(X^2) = \frac{1}{3}[(-1)^2 + (0)^2 + (1)^2] = \frac{2}{3}$
(b) The pmf $p_X(x)$ is plotted in Fig. 2-21(b). Again by Eqs. (2.26) and (2.29), we obtain

    $\mu_X = E(X) = \frac{1}{3}(-2 + 0 + 2) = 0$


Fig. 2-21

62 


2.26. 


2.27. 


2.28. 
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oy? = Var(X) = $[(—2)? + (07 + 27] = 8 


Note that the variance of X is a measure of the spread of a distribution about its mean. 


Let a r.v. X denote the outcome of throwing a fair die. Find the mean and variance of X. 
Since the die is fair, the pmf of X is 
Px(x) = pylk) = % k= 1,2,...,6 
By Eqs. (2.26) and (2.29), the mean and variance of X are 


By = E(X) = 4(1 + 243444546 =7=35 
oy? =e — 9? + (2-97 +03 -Gr% +(4-FP +6 -9" +6-FPI=H8 


Ni 


Alternatively, the variance of X can be found as follows: 
AXV=UP +P 437244457 467 = 4 
Hence, by Eq. (2.31), 
oy? = E(X’) - [E(X)P = 4 - GY =#8 
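The means and variances above can be reproduced exactly with rational arithmetic. This is a small sketch; the function name `mean_and_variance` and the dictionary representation of a pmf are illustrative choices, and the variance is computed through Eq. (2.31).

```python
from fractions import Fraction

def mean_and_variance(pmf):
    """pmf: dict mapping value -> probability. Returns (E(X), Var(X))."""
    mean = sum(x * p for x, p in pmf.items())
    second_moment = sum(x * x * p for x, p in pmf.items())
    return mean, second_moment - mean ** 2   # Var(X) = E(X^2) - [E(X)]^2

third = Fraction(1, 3)
pmf_a = {-1: third, 0: third, 1: third}          # three equally likely values -1, 0, 1
pmf_b = {-2: third, 0: third, 2: third}          # three equally likely values -2, 0, 2
die = {k: Fraction(1, 6) for k in range(1, 7)}   # fair die

die_mean, die_var = mean_and_variance(die)
```

Using `Fraction` avoids rounding, so the results can be compared directly with the fractions in the text.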


2.27. Find the mean and variance of the geometric r.v. X defined by Eq. (2.67) (Prob. 2.15).

To find the mean and variance of a geometric r.v. X, we need the following results about the sum of a geometric series and its first and second derivatives. Let

$$g(r) = \sum_{n=0}^{\infty} ar^n = \frac{a}{1-r} \qquad |r| < 1 \tag{2.82}$$

Then

$$g'(r) = \frac{dg(r)}{dr} = \sum_{n=1}^{\infty} anr^{n-1} = \frac{a}{(1-r)^2} \tag{2.83}$$

$$g''(r) = \frac{d^2g(r)}{dr^2} = \sum_{n=2}^{\infty} an(n-1)r^{n-2} = \frac{2a}{(1-r)^3} \tag{2.84}$$

By Eqs. (2.26) and (2.67), and letting $q = 1 - p$, the mean of X is given by

$$\mu_X = E(X) = \sum_{x=1}^{\infty} xq^{x-1}p = \frac{p}{(1-q)^2} = \frac{p}{p^2} = \frac{1}{p} \tag{2.85}$$

where Eq. (2.83) is used with $a = p$ and $r = q$.

To find the variance of X, we first find $E[X(X-1)]$. Now,

$$E[X(X-1)] = \sum_{x=1}^{\infty} x(x-1)q^{x-1}p = \sum_{x=2}^{\infty} pq\,x(x-1)q^{x-2} = \frac{2pq}{(1-q)^3} = \frac{2pq}{p^3} = \frac{2q}{p^2} = \frac{2(1-p)}{p^2} \tag{2.86}$$

where Eq. (2.84) is used with $a = pq$ and $r = q$.

Since $E[X(X-1)] = E(X^2 - X) = E(X^2) - E(X)$, we have

$$E(X^2) = E[X(X-1)] + E(X) = \frac{2(1-p)}{p^2} + \frac{1}{p} = \frac{2-p}{p^2} \tag{2.87}$$

Then by Eq. (2.31), the variance of X is

$$\sigma_X^2 = \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{2-p}{p^2} - \frac{1}{p^2} = \frac{1-p}{p^2} \tag{2.88}$$
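Equations (2.85) and (2.88) can be compared against truncated sums of the series. The sketch below assumes a hypothetical helper `geometric_moments` and a fixed truncation at 10,000 terms, which is adequate only when p is not very small (the neglected tail is of order $q^{10000}$).

```python
def geometric_moments(p, terms=10_000):
    """Truncated sums for E(X) and Var(X) with pmf q**(x-1) * p, x = 1, 2, ..."""
    q = 1.0 - p
    mean = sum(x * q ** (x - 1) * p for x in range(1, terms + 1))
    second = sum(x * x * q ** (x - 1) * p for x in range(1, terms + 1))
    return mean, second - mean ** 2

p = 0.2
numeric_mean, numeric_var = geometric_moments(p)
closed_mean, closed_var = 1.0 / p, (1.0 - p) / p ** 2   # Eqs. (2.85) and (2.88)
```

For p = 0.2 the closed forms give a mean of 5 and a variance of 20, and the truncated sums should agree to many digits.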


2.28. Let X be a binomial r.v. with parameters (n, p). Verify Eqs. (2.38) and (2.39).

By Eqs. (2.26) and (2.36), and letting $q = 1 - p$, we have

$$E(X) = \sum_{k=0}^{n} kp_X(k) = \sum_{k=0}^{n} k\binom{n}{k}p^kq^{n-k} = \sum_{k=0}^{n} k\,\frac{n!}{(n-k)!\,k!}\,p^kq^{n-k} = np\sum_{k=1}^{n} \frac{(n-1)!}{(n-k)!\,(k-1)!}\,p^{k-1}q^{n-k}$$

Letting $i = k - 1$ and using Eq. (2.72), we obtain

$$E(X) = np\sum_{i=0}^{n-1} \frac{(n-1)!}{(n-1-i)!\,i!}\,p^iq^{n-1-i} = np\sum_{i=0}^{n-1} \binom{n-1}{i}p^iq^{n-1-i} = np(p+q)^{n-1} = np(1)^{n-1} = np$$

Next,

$$E[X(X-1)] = \sum_{k=0}^{n} k(k-1)p_X(k) = \sum_{k=0}^{n} k(k-1)\binom{n}{k}p^kq^{n-k} = \sum_{k=0}^{n} k(k-1)\,\frac{n!}{(n-k)!\,k!}\,p^kq^{n-k} = n(n-1)p^2\sum_{k=2}^{n} \frac{(n-2)!}{(n-k)!\,(k-2)!}\,p^{k-2}q^{n-k}$$

Similarly, letting $i = k - 2$ and using Eq. (2.72), we obtain

$$E[X(X-1)] = n(n-1)p^2\sum_{i=0}^{n-2} \frac{(n-2)!}{(n-2-i)!\,i!}\,p^iq^{n-2-i} = n(n-1)p^2\sum_{i=0}^{n-2} \binom{n-2}{i}p^iq^{n-2-i} = n(n-1)p^2(p+q)^{n-2} = n(n-1)p^2$$

Thus,

$$E(X^2) = E[X(X-1)] + E(X) = n(n-1)p^2 + np \tag{2.89}$$

and by Eq. (2.31),

$$\sigma_X^2 = \mathrm{Var}(X) = n(n-1)p^2 + np - (np)^2 = np(1-p)$$


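2.29. Let X be a Poisson r.v. with parameter $\lambda$. Verify Eqs. (2.42) and (2.43).

By Eqs. (2.26) and (2.40),

$$E(X) = \sum_{k=0}^{\infty} kp_X(k) = 0 + \sum_{k=1}^{\infty} ke^{-\lambda}\frac{\lambda^k}{k!} = \lambda e^{-\lambda}\sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} = \lambda e^{-\lambda}\sum_{i=0}^{\infty} \frac{\lambda^i}{i!} = \lambda e^{-\lambda}e^{\lambda} = \lambda$$

Next,

$$E[X(X-1)] = \sum_{k=0}^{\infty} k(k-1)e^{-\lambda}\frac{\lambda^k}{k!} = \lambda^2e^{-\lambda}\sum_{k=2}^{\infty} \frac{\lambda^{k-2}}{(k-2)!} = \lambda^2e^{-\lambda}\sum_{i=0}^{\infty} \frac{\lambda^i}{i!} = \lambda^2e^{-\lambda}e^{\lambda} = \lambda^2$$

Thus,

$$E(X^2) = E[X(X-1)] + E(X) = \lambda^2 + \lambda \tag{2.90}$$

and by Eq. (2.31),

$$\sigma_X^2 = \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$$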


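2.30. Find the mean and variance of the r.v. X of Prob. 2.20.

From Prob. 2.20, the pdf of X is

$$f_X(x) = \begin{cases} 2x & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$

By Eq. (2.26), the mean of X is

$$\mu_X = E(X) = \int_0^1 x(2x)\,dx = 2\,\frac{x^3}{3}\Big|_0^1 = \frac{2}{3}$$

By Eq. (2.27), we have

$$E(X^2) = \int_0^1 x^2(2x)\,dx = 2\,\frac{x^4}{4}\Big|_0^1 = \frac{1}{2}$$

Thus, by Eq. (2.31), the variance of X is

$$\sigma_X^2 = \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{1}{2} - \left(\frac{2}{3}\right)^2 = \frac{1}{18}$$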


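2.31. Let X be a uniform r.v. over (a, b). Verify Eqs. (2.46) and (2.47).

By Eqs. (2.44) and (2.26), the mean of X is

$$\mu_X = E(X) = \int_a^b x\,\frac{1}{b-a}\,dx = \frac{1}{b-a}\,\frac{x^2}{2}\Big|_a^b = \frac{1}{2}\,\frac{b^2 - a^2}{b-a} = \frac{1}{2}(b+a)$$

By Eq. (2.27), we have

$$E(X^2) = \int_a^b x^2\,\frac{1}{b-a}\,dx = \frac{1}{b-a}\,\frac{x^3}{3}\Big|_a^b = \frac{1}{3}\,\frac{b^3 - a^3}{b-a} = \frac{1}{3}(b^2 + ab + a^2) \tag{2.91}$$

Thus, by Eq. (2.31), the variance of X is

$$\sigma_X^2 = \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \tfrac{1}{3}(b^2 + ab + a^2) - \tfrac{1}{4}(b+a)^2 = \tfrac{1}{12}(b-a)^2$$

2.32. Let X be an exponential r.v. with parameter $\lambda$. Verify Eqs. (2.50) and (2.51).

By Eqs. (2.48) and (2.26), the mean of X is

$$\mu_X = E(X) = \int_0^\infty x\lambda e^{-\lambda x}\,dx$$

Integrating by parts ($u = x$, $dv = \lambda e^{-\lambda x}\,dx$) yields

$$E(X) = -xe^{-\lambda x}\Big|_0^\infty + \int_0^\infty e^{-\lambda x}\,dx = \frac{1}{\lambda}$$

Next, by Eq. (2.27),

$$E(X^2) = \int_0^\infty x^2\lambda e^{-\lambda x}\,dx$$

Again integrating by parts ($u = x^2$, $dv = \lambda e^{-\lambda x}\,dx$), we obtain

$$E(X^2) = -x^2e^{-\lambda x}\Big|_0^\infty + 2\int_0^\infty xe^{-\lambda x}\,dx = \frac{2}{\lambda^2} \tag{2.92}$$

Thus, by Eq. (2.31), the variance of X is

$$\sigma_X^2 = E(X^2) - [E(X)]^2 = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{1}{\lambda^2}$$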


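2.33. Let $X = N(\mu; \sigma^2)$. Verify Eqs. (2.57) and (2.58).

Using Eqs. (2.52) and (2.26), we have

$$\mu_X = E(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} xe^{-(x-\mu)^2/(2\sigma^2)}\,dx$$

Writing x as $(x - \mu) + \mu$, we have

$$E(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} (x-\mu)e^{-(x-\mu)^2/(2\sigma^2)}\,dx + \mu\,\frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-(x-\mu)^2/(2\sigma^2)}\,dx$$

Letting $y = x - \mu$ in the first integral, we obtain

$$E(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} ye^{-y^2/(2\sigma^2)}\,dy + \mu\int_{-\infty}^{\infty} f_X(x)\,dx$$

The first integral is zero, since its integrand is an odd function. Thus, by the property of the pdf, Eq. (2.22), we get

$$\mu_X = E(X) = \mu$$

Next, by Eq. (2.29),

$$\sigma_X^2 = E[(X - \mu)^2] = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} (x-\mu)^2e^{-(x-\mu)^2/(2\sigma^2)}\,dx$$

From Eqs. (2.22) and (2.52), we have

$$\int_{-\infty}^{\infty} e^{-(x-\mu)^2/(2\sigma^2)}\,dx = \sigma\sqrt{2\pi}$$

Differentiating with respect to $\sigma$, we obtain

$$\int_{-\infty}^{\infty} \frac{(x-\mu)^2}{\sigma^3}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx = \sqrt{2\pi}$$

Multiplying both sides by $\sigma^2/\sqrt{2\pi}$, we have

$$\frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} (x-\mu)^2e^{-(x-\mu)^2/(2\sigma^2)}\,dx = \sigma^2$$

Thus, $\sigma_X^2 = \mathrm{Var}(X) = \sigma^2$.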


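2.34. Find the mean and variance of a Rayleigh r.v. defined by Eq. (2.74) (Prob. 2.23).

Using Eqs. (2.74) and (2.26), we have

$$\mu_X = E(X) = \int_0^\infty x\,\frac{x}{\sigma^2}\,e^{-x^2/(2\sigma^2)}\,dx = \frac{1}{\sigma^2}\int_0^\infty x^2e^{-x^2/(2\sigma^2)}\,dx$$

Now the variance of $N(0; \sigma^2)$ is given by

$$\frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} x^2e^{-x^2/(2\sigma^2)}\,dx = \sigma^2$$

Since the integrand is an even function, we have

$$\frac{1}{\sqrt{2\pi}\,\sigma}\int_0^\infty x^2e^{-x^2/(2\sigma^2)}\,dx = \frac{1}{2}\sigma^2$$

or

$$\int_0^\infty x^2e^{-x^2/(2\sigma^2)}\,dx = \frac{1}{2}\sqrt{2\pi}\,\sigma^3 = \sqrt{\frac{\pi}{2}}\,\sigma^3$$

Then

$$\mu_X = E(X) = \frac{1}{\sigma^2}\sqrt{\frac{\pi}{2}}\,\sigma^3 = \sqrt{\frac{\pi}{2}}\,\sigma \tag{2.93}$$

Next,

$$E(X^2) = \int_0^\infty x^2\,\frac{x}{\sigma^2}\,e^{-x^2/(2\sigma^2)}\,dx = \frac{1}{\sigma^2}\int_0^\infty x^3e^{-x^2/(2\sigma^2)}\,dx$$

Let $y = x^2/(2\sigma^2)$. Then $dy = x\,dx/\sigma^2$, and so

$$E(X^2) = 2\sigma^2\int_0^\infty ye^{-y}\,dy = 2\sigma^2 \tag{2.94}$$

Hence, by Eq. (2.31),

$$\sigma_X^2 = E(X^2) - [E(X)]^2 = \left(2 - \frac{\pi}{2}\right)\sigma^2 \approx 0.429\,\sigma^2 \tag{2.95}$$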


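2.35. Consider a continuous r.v. X with pdf $f_X(x)$. If $f_X(x) = 0$ for $x < 0$, then show that, for any $a > 0$,

$$P(X \geq a) \leq \frac{\mu_X}{a} \tag{2.96}$$

where $\mu_X = E(X)$. This is known as the Markov inequality.

From Eq. (2.23),

$$P(X \geq a) = \int_a^\infty f_X(x)\,dx$$

Since $f_X(x) = 0$ for $x < 0$,

$$\mu_X = E(X) = \int_0^\infty xf_X(x)\,dx \geq \int_a^\infty xf_X(x)\,dx \geq a\int_a^\infty f_X(x)\,dx$$

Hence,

$$\int_a^\infty f_X(x)\,dx = P(X \geq a) \leq \frac{\mu_X}{a}$$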


2.36. For any $a > 0$, show that

$$P(|X - \mu_X| \geq a) \leq \frac{\sigma_X^2}{a^2} \tag{2.97}$$

where $\mu_X$ and $\sigma_X^2$ are the mean and variance of X, respectively. This is known as the Chebyshev inequality.

From Eq. (2.23),

$$P(|X - \mu_X| \geq a) = \int_{-\infty}^{\mu_X - a} f_X(x)\,dx + \int_{\mu_X + a}^{\infty} f_X(x)\,dx = \int_{|x - \mu_X| \geq a} f_X(x)\,dx$$

By Eq. (2.29),

$$\sigma_X^2 = \int_{-\infty}^{\infty} (x - \mu_X)^2f_X(x)\,dx \geq \int_{|x - \mu_X| \geq a} (x - \mu_X)^2f_X(x)\,dx \geq a^2\int_{|x - \mu_X| \geq a} f_X(x)\,dx$$

Hence,

$$\int_{|x - \mu_X| \geq a} f_X(x)\,dx \leq \frac{\sigma_X^2}{a^2} \qquad \text{or} \qquad P(|X - \mu_X| \geq a) \leq \frac{\sigma_X^2}{a^2}$$

Note that by setting $a = k\sigma_X$ in Eq. (2.97), we obtain

$$P(|X - \mu_X| \geq k\sigma_X) \leq \frac{1}{k^2} \tag{2.98}$$

Equation (2.98) says that the probability that a r.v. will fall k or more standard deviations from its mean is at most $1/k^2$. Notice that nothing at all is said about the distribution function of X. The Chebyshev inequality is therefore a very general statement. However, when applied to a particular case, it may be quite weak.
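How weak the bound of Eq. (2.98) can be is easy to see for a specific distribution. The sketch below uses an exponential r.v. as the example, chosen here purely for illustration; its mean and standard deviation are both $1/\lambda$ (Prob. 2.32), so the exact two-sided tail does not depend on $\lambda$. The function names are hypothetical.

```python
import math

def exact_tail_exponential(k):
    """P(|X - mu| >= k*sigma) for an exponential r.v., where mu = sigma = 1/lambda."""
    upper = math.exp(-(1.0 + k))                            # P(X >= mu + k*sigma)
    lower = 1.0 - math.exp(-(1.0 - k)) if k < 1 else 0.0    # P(X <= mu - k*sigma)
    return upper + lower

def chebyshev_bound(k):
    """Right-hand side of Eq. (2.98)."""
    return 1.0 / k ** 2

# (k, exact probability, Chebyshev bound)
table = [(k, exact_tail_exponential(k), chebyshev_bound(k)) for k in (1.5, 2, 3, 4)]
```

At k = 2, for instance, the exact probability is $e^{-3}$, roughly 0.05, while the bound only guarantees that it does not exceed 0.25.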


SPECIAL DISTRIBUTIONS 


2.37. A binary source generates digits 1 and 0 randomly with probabilities 0.6 and 0.4, respectively.

(a) What is the probability that two 1s and three 0s will occur in a five-digit sequence?
(b) What is the probability that at least three 1s will occur in a five-digit sequence?

(a) Let X be the r.v. denoting the number of 1s generated in a five-digit sequence. Since there are only two possible outcomes (1 or 0), the probability of generating 1 is constant, and there are five digits, it is clear that X is a binomial r.v. with parameters (n, p) = (5, 0.6). Hence, by Eq. (2.36), the probability that two 1s and three 0s will occur in a five-digit sequence is

$$P(X = 2) = \binom{5}{2}(0.6)^2(0.4)^3 = 0.23$$

(b) The probability that at least three 1s will occur in a five-digit sequence is

$$P(X \geq 3) = 1 - P(X \leq 2)$$

where

$$P(X \leq 2) = \sum_{k=0}^{2} \binom{5}{k}(0.6)^k(0.4)^{5-k} = 0.317$$

Hence, $P(X \geq 3) = 1 - 0.317 = 0.683$.

2.38. A fair coin is flipped 10 times. Find the probability of the occurrence of 5 or 6 heads.

Let the r.v. X denote the number of heads occurring when a fair coin is flipped 10 times. Then X is a binomial r.v. with parameters (n, p) = (10, 1/2). Thus, by Eq. (2.36),

$$P(5 \leq X \leq 6) = \sum_{k=5}^{6} \binom{10}{k}\left(\frac{1}{2}\right)^k\left(\frac{1}{2}\right)^{10-k} = 0.451$$

2.39. Let X be a binomial r.v. with parameters (n, p), where $0 < p < 1$. Show that as k goes from 0 to n, the pmf $p_X(k)$ of X first increases monotonically and then decreases monotonically, reaching its largest value when k is the largest integer less than or equal to $(n + 1)p$.

By Eq. (2.36), we have

$$\frac{p_X(k)}{p_X(k-1)} = \frac{\binom{n}{k}p^k(1-p)^{n-k}}{\binom{n}{k-1}p^{k-1}(1-p)^{n-k+1}} = \frac{(n-k+1)p}{k(1-p)} \tag{2.99}$$

Hence, $p_X(k) > p_X(k-1)$ if and only if $(n - k + 1)p > k(1 - p)$ or $k < (n + 1)p$. Thus, we see that $p_X(k)$ increases monotonically and reaches its maximum when k is the largest integer less than or equal to $(n + 1)p$ and then decreases monotonically.
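The location of the maximum can be checked by brute force for a few parameter pairs. The sketch below is illustrative only: `binomial_pmf` and `argmax_pmf` are hypothetical helper names, and the test cases are chosen so that $(n + 1)p$ is not an integer (when it is, Eq. (2.99) equals 1 at $k = (n+1)p$ and the maximum is shared by two adjacent values of k).

```python
from math import comb, floor

def binomial_pmf(n, p):
    """List of p_X(k) for k = 0, ..., n, from Eq. (2.36)."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

def argmax_pmf(n, p):
    """Index k at which the binomial pmf is largest."""
    pmf = binomial_pmf(n, p)
    return max(range(n + 1), key=pmf.__getitem__)

cases = [(5, 0.6), (10, 0.3), (20, 0.37)]
# Pairs (brute-force mode, largest integer <= (n + 1)p)
modes = [(argmax_pmf(n, p), floor((n + 1) * p)) for n, p in cases]
```

In each pair the two entries should coincide, in agreement with the result of Prob. 2.39.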


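2.40. Show that the Poisson distribution can be used as a convenient approximation to the binomial distribution for large n and small p.

From Eq. (2.36), the pmf of the binomial r.v. with parameters (n, p) is

$$p_X(k) = \binom{n}{k}p^k(1-p)^{n-k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!}\,p^k(1-p)^{n-k}$$

Multiplying and dividing the right-hand side by $n^k$, we have

$$\binom{n}{k}p^k(1-p)^{n-k} = \frac{\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{k-1}{n}\right)}{k!}\,(np)^k\left(1 - \frac{np}{n}\right)^{n-k}$$

If we let $n \to \infty$ in such a way that $np = \lambda$ remains constant, then

$$\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{k-1}{n}\right) \to 1$$

$$\left(1 - \frac{\lambda}{n}\right)^{n-k} = \left(1 - \frac{\lambda}{n}\right)^{-k}\left(1 - \frac{\lambda}{n}\right)^{n} \to (1)^{-k}e^{-\lambda} = e^{-\lambda}$$

where we used the fact that

$$\lim_{n \to \infty}\left(1 - \frac{\lambda}{n}\right)^n = e^{-\lambda}$$

Hence, in the limit as $n \to \infty$ with $np = \lambda$ (and as $p = \lambda/n \to 0$),

$$\binom{n}{k}p^k(1-p)^{n-k} \to e^{-\lambda}\frac{\lambda^k}{k!} \qquad np = \lambda$$

Thus, in the case of large n and small p,

$$\binom{n}{k}p^k(1-p)^{n-k} \approx e^{-\lambda}\frac{\lambda^k}{k!} \qquad np = \lambda \tag{2.100}$$

which indicates that the binomial distribution can be approximated by the Poisson distribution.

2.41. A noisy transmission channel has a per-digit error probability p = 0.01.

(a) Calculate the probability of more than one error in 10 received digits.
(b) Repeat (a), using the Poisson approximation Eq. (2.100).

(a) It is clear that the number of errors in 10 received digits is a binomial r.v. X with parameters (n, p) = (10, 0.01). Then, using Eq. (2.36), we obtain

$$P(X > 1) = 1 - P(X = 0) - P(X = 1) = 1 - \binom{10}{0}(0.01)^0(0.99)^{10} - \binom{10}{1}(0.01)^1(0.99)^9 \approx 0.0043$$

(b) Using Eq. (2.100) with $\lambda = np = 10(0.01) = 0.1$, we have

$$P(X > 1) = 1 - P(X = 0) - P(X = 1) = 1 - e^{-0.1}\frac{(0.1)^0}{0!} - e^{-0.1}\frac{(0.1)^1}{1!} \approx 0.0047$$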
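The two numbers in Prob. 2.41 are quick to recompute. The following sketch uses only the standard library; the function names are illustrative and not taken from the text.

```python
from math import comb, exp, factorial

def binomial_tail_gt1(n, p):
    """P(X > 1) for a binomial r.v. with parameters (n, p), via Eq. (2.36)."""
    return 1.0 - sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in (0, 1))

def poisson_tail_gt1(lam):
    """P(X > 1) for a Poisson r.v. with parameter lam, as in Eq. (2.100)."""
    return 1.0 - sum(exp(-lam) * lam ** k / factorial(k) for k in (0, 1))

exact = binomial_tail_gt1(10, 0.01)     # part (a)
approx = poisson_tail_gt1(10 * 0.01)    # part (b), lambda = np
```

Even with n as small as 10, the approximation is within about 10 percent of the exact value here because p is small.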


2.42. The number of telephone calls arriving at a switchboard during any 10-minute period is known to be a Poisson r.v. X with $\lambda = 2$.

(a) Find the probability that more than three calls will arrive during any 10-minute period.
(b) Find the probability that no calls will arrive during any 10-minute period.

(a) From Eq. (2.40), the pmf of X is

$$p_X(k) = P(X = k) = e^{-2}\frac{2^k}{k!} \qquad k = 0, 1, \ldots$$

Thus,

$$P(X > 3) = 1 - P(X \leq 3) = 1 - \sum_{k=0}^{3} e^{-2}\frac{2^k}{k!} = 1 - e^{-2}\left(1 + 2 + \tfrac{4}{2} + \tfrac{8}{6}\right) \approx 0.143$$

(b) $P(X = 0) = p_X(0) = e^{-2} \approx 0.135$

2.43. Consider the experiment of throwing a pair of fair dice.

(a) Find the probability that it will take less than six tosses to throw a 7.
(b) Find the probability that it will take more than six tosses to throw a 7.

(a) From Prob. 1.31(a), we see that the probability of throwing a 7 on any toss is 1/6. Let X denote the number of tosses required for the first success of throwing a 7. Then, from Prob. 2.15, it is clear that X is a geometric r.v. with parameter p = 1/6. Thus, using Eq. (2.71) of Prob. 2.15, we obtain

$$P(X < 6) = P(X \leq 5) = F_X(5) = 1 - \left(\tfrac{5}{6}\right)^5 \approx 0.598$$

(b) Similarly, we get

$$P(X > 6) = 1 - P(X \leq 6) = 1 - F_X(6) = 1 - \left[1 - \left(\tfrac{5}{6}\right)^6\right] = \left(\tfrac{5}{6}\right)^6 \approx 0.335$$

2.44. Consider the experiment of rolling a fair die. Find the average number of rolls required in order to obtain a 6.

Let X denote the number of trials (rolls) required until the number 6 first appears. Then X is a geometric r.v. with parameter p = 1/6. From Eq. (2.85) of Prob. 2.27, the mean of X is given by

$$\mu_X = E(X) = \frac{1}{p} = \frac{1}{1/6} = 6$$

Thus, the average number of rolls required in order to obtain a 6 is 6.

2.45. Assume that the length of a phone call in minutes is an exponential r.v. X with parameter $\lambda = \tfrac{1}{10}$. If someone arrives at a phone booth just before you arrive, find the probability that you will have to wait (a) less than 5 minutes, and (b) between 5 and 10 minutes.

(a) From Eq. (2.48), the pdf of X is

$$f_X(x) = \begin{cases} \tfrac{1}{10}e^{-x/10} & x > 0 \\ 0 & x < 0 \end{cases}$$

Then

$$P(X < 5) = \int_0^5 \tfrac{1}{10}e^{-x/10}\,dx = -e^{-x/10}\Big|_0^5 = 1 - e^{-0.5} \approx 0.393$$

(b) Similarly,

$$P(5 < X < 10) = \int_5^{10} \tfrac{1}{10}e^{-x/10}\,dx = e^{-0.5} - e^{-1} \approx 0.239$$

2.46. All manufactured devices and machines fail to work sooner or later. Suppose that the failure rate is constant and the time to failure (in hours) is an exponential r.v. X with parameter $\lambda$.

(a) Measurements show that the probability that the time to failure for computer memory chips in a given class exceeds $10^4$ hours is $e^{-1}$ ($\approx 0.368$). Calculate the value of the parameter $\lambda$.

(b) Using the value of the parameter $\lambda$ determined in part (a), calculate the time $x_0$ such that the probability that the time to failure is less than $x_0$ is 0.05.

(a) From Eq. (2.49), the cdf of X is given by

$$F_X(x) = \begin{cases} 1 - e^{-\lambda x} & x > 0 \\ 0 & x < 0 \end{cases}$$

Now

$$P(X > 10^4) = 1 - P(X \leq 10^4) = 1 - F_X(10^4) = 1 - \left(1 - e^{-\lambda(10^4)}\right) = e^{-\lambda(10^4)} = e^{-1}$$

from which we obtain $\lambda = 10^{-4}$.

(b) We want

$$F_X(x_0) = P(X \leq x_0) = 0.05$$

Hence,

$$1 - e^{-\lambda x_0} = 1 - e^{-10^{-4}x_0} = 0.05 \qquad \text{or} \qquad e^{-10^{-4}x_0} = 0.95$$

from which we obtain

$$x_0 = -10^4\ln(0.95) = 513 \text{ hours}$$


2.47. A production line manufactures 1000-ohm (Ω) resistors that have 10 percent tolerance. Let X denote the resistance of a resistor. Assuming that X is a normal r.v. with mean 1000 and variance 2500, find the probability that a resistor picked at random will be rejected.

Let A be the event that a resistor is rejected. Then $A = \{X < 900\} \cup \{X > 1100\}$. Since $\{X < 900\} \cap \{X > 1100\} = \varnothing$, we have

$$P(A) = P(X < 900) + P(X > 1100) = F_X(900) + [1 - F_X(1100)]$$

Since X is a normal r.v. with $\mu = 1000$ and $\sigma^2 = 2500$ ($\sigma = 50$), by Eq. (2.55) and Table A (Appendix A),

$$F_X(900) = \Phi\left(\frac{900 - 1000}{50}\right) = \Phi(-2) = 1 - \Phi(2)$$

$$F_X(1100) = \Phi\left(\frac{1100 - 1000}{50}\right) = \Phi(2)$$

Thus, $P(A) = 2[1 - \Phi(2)] \approx 0.045$.
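In place of a table lookup, $\Phi(z)$ can be evaluated through the error function, using the standard identity $\Phi(z) = \tfrac{1}{2}[1 + \operatorname{erf}(z/\sqrt{2})]$. The sketch below applies it to the resistor problem; the function names are illustrative.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Phi(z), the cdf of N(0; 1), written via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def rejection_probability(mean, std, lower, upper):
    """P(X < lower) + P(X > upper) for X = N(mean; std**2)."""
    return std_normal_cdf((lower - mean) / std) + 1.0 - std_normal_cdf((upper - mean) / std)

# 1000-ohm resistors, sigma = 50, accepted between 900 and 1100 ohms
p_reject = rejection_probability(1000.0, 50.0, 900.0, 1100.0)
```

The computed value is about 0.0455, consistent with the table-based figure quoted in the text.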


2.48. The radial miss distance [in meters (m)] of the landing point of a parachuting sky diver from the center of the target area is known to be a Rayleigh r.v. X with parameter $\sigma^2 = 100$.

(a) Find the probability that the sky diver will land within a radius of 10 m from the center of the target area.

(b) Find the radius r such that the probability that $X > r$ is $e^{-1}$ ($\approx 0.368$).

(a) Using Eq. (2.75) of Prob. 2.23, we obtain

$$P(X \leq 10) = F_X(10) = 1 - e^{-100/200} = 1 - e^{-0.5} \approx 0.393$$

(b) Now

$$P(X > r) = 1 - P(X \leq r) = 1 - F_X(r) = 1 - \left(1 - e^{-r^2/200}\right) = e^{-r^2/200} = e^{-1}$$

from which we obtain $r^2 = 200$ and $r = \sqrt{200} = 14.142$ m.


CONDITIONAL DISTRIBUTIONS 
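2.49. Let X be a Poisson r.v. with parameter $\lambda$. Find the conditional pmf of X given B = (X is even).

From Eq. (2.40), the pmf of X is

$$p_X(k) = e^{-\lambda}\frac{\lambda^k}{k!} \qquad k = 0, 1, \ldots$$

Then the probability of event B is

$$P(B) = P(X = 0, 2, 4, \ldots) = \sum_{k\ \text{even}} e^{-\lambda}\frac{\lambda^k}{k!}$$

Let A = {X is odd}. Then the probability of event A is

$$P(A) = P(X = 1, 3, 5, \ldots) = \sum_{k\ \text{odd}} e^{-\lambda}\frac{\lambda^k}{k!}$$

Now

$$\sum_{k\ \text{even}} e^{-\lambda}\frac{\lambda^k}{k!} + \sum_{k\ \text{odd}} e^{-\lambda}\frac{\lambda^k}{k!} = e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!} = e^{-\lambda}e^{\lambda} = 1 \tag{2.101}$$

$$\sum_{k\ \text{even}} e^{-\lambda}\frac{\lambda^k}{k!} - \sum_{k\ \text{odd}} e^{-\lambda}\frac{\lambda^k}{k!} = e^{-\lambda}\sum_{k=0}^{\infty}\frac{(-\lambda)^k}{k!} = e^{-\lambda}e^{-\lambda} = e^{-2\lambda} \tag{2.102}$$

Hence, adding Eqs. (2.101) and (2.102), we obtain

$$P(B) = \sum_{k\ \text{even}} e^{-\lambda}\frac{\lambda^k}{k!} = \frac{1}{2}\left(1 + e^{-2\lambda}\right) \tag{2.103}$$

Now, by Eq. (2.62), the pmf of X given B is

$$p_X(k \mid B) = \frac{P\{(X = k) \cap B\}}{P(B)}$$

If k is even, $(X = k) \subset B$ and $(X = k) \cap B = (X = k)$. If k is odd, $(X = k) \cap B = \varnothing$. Hence,

$$p_X(k \mid B) = \begin{cases} \dfrac{P(X = k)}{P(B)} = \dfrac{2e^{-\lambda}\lambda^k}{(1 + e^{-2\lambda})\,k!} & k\ \text{even} \\[2ex] \dfrac{P(\varnothing)}{P(B)} = 0 & k\ \text{odd} \end{cases}$$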
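The closed form of Eq. (2.103) and the normalization of the conditional pmf can be checked by summing the series directly. This is a sketch only: the value $\lambda = 1.5$ and the truncation at k = 99 are arbitrary illustrative choices, and the function names are not from the text.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Poisson pmf of Eq. (2.40)."""
    return exp(-lam) * lam ** k / factorial(k)

def prob_even(lam):
    """Closed form of P(B), Eq. (2.103)."""
    return 0.5 * (1.0 + exp(-2.0 * lam))

def conditional_pmf_given_even(k, lam):
    """p_X(k | B) with B = (X is even)."""
    return poisson_pmf(k, lam) / prob_even(lam) if k % 2 == 0 else 0.0

lam = 1.5
series_even = sum(poisson_pmf(k, lam) for k in range(0, 100, 2))   # direct sum over even k
total = sum(conditional_pmf_given_even(k, lam) for k in range(100))
```

`series_even` should match `prob_even(lam)`, and `total` should be 1 up to rounding, as any pmf must be.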


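2.50. Show that the conditional cdf and pdf of X given the event $B = (a < X \leq b)$ are as follows:

$$F_X(x \mid a < X \leq b) = \begin{cases} 0 & x \leq a \\[1ex] \dfrac{F_X(x) - F_X(a)}{F_X(b) - F_X(a)} & a < x \leq b \\[2ex] 1 & x > b \end{cases} \tag{2.104}$$

$$f_X(x \mid a < X \leq b) = \begin{cases} 0 & x \leq a \\[1ex] \dfrac{f_X(x)}{\int_a^b f_X(\xi)\,d\xi} & a < x \leq b \\[2ex] 0 & x > b \end{cases} \tag{2.105}$$

Substituting $B = (a < X \leq b)$ in Eq. (2.59), we have

$$F_X(x \mid a < X \leq b) = P(X \leq x \mid a < X \leq b) = \frac{P\{(X \leq x) \cap (a < X \leq b)\}}{P(a < X \leq b)}$$

Now

$$(X \leq x) \cap (a < X \leq b) = \begin{cases} \varnothing & x \leq a \\ (a < X \leq x) & a < x \leq b \\ (a < X \leq b) & x > b \end{cases}$$

Hence,

$$F_X(x \mid a < X \leq b) = \frac{P(\varnothing)}{P(a < X \leq b)} = 0 \qquad x \leq a$$

$$F_X(x \mid a < X \leq b) = \frac{P(a < X \leq x)}{P(a < X \leq b)} = \frac{F_X(x) - F_X(a)}{F_X(b) - F_X(a)} \qquad a < x \leq b$$

$$F_X(x \mid a < X \leq b) = \frac{P(a < X \leq b)}{P(a < X \leq b)} = 1 \qquad x > b$$

By Eq. (2.63), the conditional pdf of X given $a < X \leq b$ is obtained by differentiating Eq. (2.104) with respect to x. Thus,

$$f_X(x \mid a < X \leq b) = \begin{cases} 0 & x \leq a \\[1ex] \dfrac{f_X(x)}{F_X(b) - F_X(a)} = \dfrac{f_X(x)}{\int_a^b f_X(\xi)\,d\xi} & a < x \leq b \\[2ex] 0 & x > b \end{cases}$$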

2.51. Recall the parachuting sky diver problem (Prob. 2.48). Find the probability of the sky diver landing within a 10-m radius from the center of the target area given that the landing is within 50 m from the center of the target area.

From Eq. (2.75) (Prob. 2.23) with $\sigma^2 = 100$, we have

$$F_X(x) = 1 - e^{-x^2/200}$$

Setting $x = 10$ and $b = 50$ and $a = -\infty$ in Eq. (2.104), we obtain

$$P(X \leq 10 \mid X \leq 50) = F_X(10 \mid X \leq 50) = \frac{F_X(10)}{F_X(50)} = \frac{1 - e^{-100/200}}{1 - e^{-2500/200}} \approx 0.393$$
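2.52. Let $X = N(0; \sigma^2)$. Find $E(X \mid X > 0)$ and $\mathrm{Var}(X \mid X > 0)$.

From Eq. (2.52), the pdf of $X = N(0; \sigma^2)$ is

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/(2\sigma^2)}$$

Then by Eq. (2.105),

$$f_X(x \mid X > 0) = \begin{cases} 0 & x < 0 \\[1ex] \dfrac{f_X(x)}{\int_0^\infty f_X(\xi)\,d\xi} = 2\,\dfrac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/(2\sigma^2)} & x \geq 0 \end{cases} \tag{2.106}$$

Hence,

$$E(X \mid X > 0) = 2\,\frac{1}{\sqrt{2\pi}\,\sigma}\int_0^\infty xe^{-x^2/(2\sigma^2)}\,dx$$

Let $y = x^2/(2\sigma^2)$. Then $dy = x\,dx/\sigma^2$, and we get

$$E(X \mid X > 0) = \frac{2\sigma}{\sqrt{2\pi}}\int_0^\infty e^{-y}\,dy = \sigma\sqrt{\frac{2}{\pi}} \tag{2.107}$$

Next,

$$E(X^2 \mid X > 0) = 2\,\frac{1}{\sqrt{2\pi}\,\sigma}\int_0^\infty x^2e^{-x^2/(2\sigma^2)}\,dx = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} x^2e^{-x^2/(2\sigma^2)}\,dx = \mathrm{Var}(X) = \sigma^2 \tag{2.108}$$

Then by Eq. (2.31), we obtain

$$\mathrm{Var}(X \mid X > 0) = E(X^2 \mid X > 0) - [E(X \mid X > 0)]^2 = \sigma^2\left(1 - \frac{2}{\pi}\right) \approx 0.363\,\sigma^2 \tag{2.109}$$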
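Equations (2.107) and (2.109) can be confirmed by integrating the conditional pdf of Eq. (2.106) numerically. The sketch below rests on a few assumptions made for illustration: a midpoint rule, truncation of the upper limit at $12\sigma$, and the sample value $\sigma = 3$; the function names are hypothetical.

```python
import math

def conditional_pdf(x, sigma):
    """f_X(x | X > 0) of Eq. (2.106) for X = N(0; sigma**2)."""
    if x < 0:
        return 0.0
    return 2.0 / (math.sqrt(2.0 * math.pi) * sigma) * math.exp(-x * x / (2.0 * sigma ** 2))

def conditional_moment(order, sigma, n=100_000):
    """Midpoint-rule estimate of E(X**order | X > 0), integral truncated at 12*sigma."""
    h = 12.0 * sigma / n
    return h * sum(((i + 0.5) * h) ** order * conditional_pdf((i + 0.5) * h, sigma)
                   for i in range(n))

sigma = 3.0
cond_mean = conditional_moment(1, sigma)
cond_var = conditional_moment(2, sigma) - cond_mean ** 2
```

The estimates should be close to $\sigma\sqrt{2/\pi}$ and $\sigma^2(1 - 2/\pi)$, respectively.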


2.53. A r.v. X is said to be without memory, or memoryless, if

$$P(X \leq x + t \mid X > t) = P(X \leq x) \qquad x, t > 0 \tag{2.110}$$

Show that if X is a nonnegative continuous r.v. which is memoryless, then X must be an exponential r.v.

By Eq. (1.39), the memoryless condition (2.110) is equivalent to

$$\frac{P(X \leq x + t,\ X > t)}{P(X > t)} = P(X \leq x)$$

or

$$P(X \leq x + t,\ X > t) = P(X \leq x)P(X > t) \tag{2.111}$$

If X is a nonnegative continuous r.v., then Eq. (2.111) becomes

$$P(t < X \leq x + t) = P(0 < X \leq x)P(X > t)$$

or [by Eq. (2.25)],

$$F_X(x + t) - F_X(t) = [F_X(x) - F_X(0)][1 - F_X(t)]$$

Noting that $F_X(0) = 0$ and rearranging the above equation, we get

$$\frac{F_X(x + t) - F_X(x)}{t} = \frac{F_X(t)[1 - F_X(x)]}{t}$$

Taking the limit as $t \to 0$, we obtain

$$F_X'(x) = F_X'(0)[1 - F_X(x)] \tag{2.112}$$

where $F_X'(x)$ denotes the derivative of $F_X(x)$. Let

$$R_X(x) = 1 - F_X(x) \tag{2.113}$$

Then Eq. (2.112) becomes

$$R_X'(x) = R_X'(0)R_X(x)$$

The solution to this differential equation is given by

$$R_X(x) = ke^{R_X'(0)x}$$

where k is an integration constant. Noting that $k = R_X(0) = 1$ and letting $R_X'(0) = -F_X'(0) = -f_X(0) = -\lambda$, we obtain

$$R_X(x) = e^{-\lambda x}$$

and hence by Eq. (2.113),

$$F_X(x) = 1 - R_X(x) = 1 - e^{-\lambda x} \qquad x \geq 0$$

Thus, by Eq. (2.49), we conclude that X is an exponential r.v. with parameter $\lambda = f_X(0)$ (> 0).

Note that the memoryless property Eq. (2.110) is also known as the Markov property (see Chap. 5), and it may be equivalently expressed as

$$P(X > x + t \mid X > t) = P(X > x) \qquad x > 0,\ t > 0 \tag{2.114}$$

or

$$P(X > x + t) = P(X > x)P(X > t) \qquad x > 0,\ t > 0 \tag{2.115}$$

Let X be the lifetime (in hours) of a component. Then Eq. (2.114) states that the probability that the component will operate for at least $x + t$ hours given that it has been operational for t hours is the same as the initial probability that it will operate for at least x hours. In other words, the component "forgets" how long it has been operating.

Note that Eq. (2.115) is satisfied when X is an exponential r.v., since $P(X > x) = 1 - F_X(x) = e^{-\lambda x}$ and $e^{-\lambda(x+t)} = e^{-\lambda x}e^{-\lambda t}$.
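Equation (2.114) can be illustrated by computing the conditional survival probability for an exponential r.v. and, for contrast, for a distribution that is not memoryless. In the sketch below the uniform r.v. over (0, 10) and the numerical values of $\lambda$, x, and t are assumptions chosen only for the demonstration, and the function names are hypothetical.

```python
import math

def survival_exponential(x, lam):
    """P(X > x) = exp(-lam * x) for x >= 0."""
    return math.exp(-lam * x) if x >= 0 else 1.0

def survival_uniform(x, b=10.0):
    """P(X > x) for X uniform over (0, b); a contrast case that is not memoryless."""
    return min(1.0, max(0.0, 1.0 - x / b))

def conditional_survival(x, t, survival):
    """P(X > x + t | X > t) computed from a survival function."""
    return survival(x + t) / survival(t)

lam, x, t = 0.25, 3.0, 5.0
exp_cond = conditional_survival(x, t, lambda s: survival_exponential(s, lam))
uni_cond = conditional_survival(x, t, survival_uniform)
```

For the exponential case the conditional probability equals $P(X > x)$, whereas for the uniform case it drops from 0.7 to 0.4: a uniformly distributed lifetime does "remember" its age.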


Supplementary Problems 


2.54. Consider the experiment of tossing a coin. Heads appear about once out of every three tosses. If this experiment is repeated, what is the probability of the event that heads appear exactly twice during the first five tosses?

Ans. 0.329

2.55. Consider the experiment of tossing a fair coin three times (Prob. 1.1). Let X be the r.v. that counts the number of heads in each sample point. Find the following probabilities: (a) $P(X \leq 1)$; (b) $P(X > 1)$; and (c) $P(0 < X < 3)$.

Ans. (a) 1/2, (b) 1/2, (c) 3/4

2.56. Consider the experiment of throwing two fair dice (Prob. 1.31). Let X be the r.v. indicating the sum of the numbers that appear.

(a) What is the range of X?
(b) Find (i) $P(X = 3)$; (ii) $P(X \leq 4)$; and (iii) $P(3 < X \leq 7)$.

Ans. (a) $R_X = \{2, 3, 4, \ldots, 12\}$
(b) (i) 1/18; (ii) 1/6; (iii) 1/2

2.57. Let X denote the number of heads obtained in the flipping of a fair coin twice.

(a) Find the pmf of X.
(b) Compute the mean and the variance of X.

Ans. (a) $p_X(0) = 1/4$, $p_X(1) = 1/2$, $p_X(2) = 1/4$
(b) $E(X) = 1$, $\mathrm{Var}(X) = 1/2$

2.58. Consider the discrete r.v. X that has the pmf

$$p_X(x_k) = \left(\tfrac{1}{2}\right)^{x_k} \qquad x_k = 1, 2, 3, \ldots$$

Let $A = \{\zeta: X(\zeta) = 1, 3, 5, 7, \ldots\}$. Find P(A).

Ans. 2/3

2.59. Consider the function given by

$$p(x) = \begin{cases} \dfrac{k}{x^2} & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}$$

where k is a constant. Find the value of k such that p(x) can be the pmf of a discrete r.v. X.

Ans. $k = 6/\pi^2$

2.60. It is known that the floppy disks produced by company A will be defective with probability 0.01. The company sells the disks in packages of 10 and offers a guarantee of replacement that at most 1 of the 10 disks is defective. Find the probability that a package purchased will have to be replaced.

Ans. 0.004

2.61. Given that X is a Poisson r.v. and $p_X(0) = 0.0498$, compute $E(X)$ and $P(X \geq 3)$.

Ans. $E(X) = 3$, $P(X \geq 3) = 0.5767$

2.62. A digital transmission system has an error probability of $10^{-6}$ per digit. Find the probability of three or more errors in $10^6$ digits by using the Poisson distribution approximation.

Ans. 0.08

2.63. Show that the pmf $p_X(x)$ of a Poisson r.v. X with parameter $\lambda$ satisfies the following recursion formula:

$$p_X(k + 1) = \frac{\lambda}{k + 1}\,p_X(k) \qquad p_X(k - 1) = \frac{k}{\lambda}\,p_X(k)$$

Hint: Use Eq. (2.40).

2.64. The continuous r.v. X has the pdf

$$f_X(x) = \begin{cases} k(x - x^2) & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$

where k is a constant. Find the value of k and the cdf of X.

Ans. $k = 6$; $F_X(x) = \begin{cases} 0 & x \leq 0 \\ 3x^2 - 2x^3 & 0 < x \leq 1 \\ 1 & x > 1 \end{cases}$

2.65. The continuous r.v. X has the pdf

$$f_X(x) = \begin{cases} k(2x - x^2) & 0 < x < 2 \\ 0 & \text{otherwise} \end{cases}$$

where k is a constant. Find the value of k and $P(X > 1)$.

Ans. $k = 3/4$; $P(X > 1) = 1/2$

2.66. A r.v. X is defined by the cdf

$$F_X(x) = \begin{cases} 0 & x < 0 \\ \tfrac{1}{2}x & 0 \leq x < 1 \\ k & 1 \leq x \end{cases}$$

(a) Find the value of k.
(b) Find the type of X.
(c) Find (i) $P(\tfrac{1}{2} < X \leq 1)$; (ii) $P(\tfrac{1}{2} < X < 1)$; and (iii) $P(X > 2)$.

Ans. (a) k = 1
(b) Mixed r.v.
(c) (i) 3/4; (ii) 1/4; (iii) 0

2.67. It is known that the time (in hours) between consecutive traffic accidents can be described by the exponential r.v. X with parameter $\lambda = \tfrac{1}{60}$. Find (i) $P(X \leq 60)$; (ii) $P(X > 120)$; and (iii) $P(10 < X \leq 100)$.

Ans. (i) 0.632; (ii) 0.135; (iii) 0.658
2.68. Binary data are transmitted over a noisy communication channel in blocks of 16 binary digits. The probability that a received digit is in error as a result of channel noise is 0.01. Assume that the errors occurring in various digit positions within a block are independent.

(a) Find the mean and the variance of the number of errors per block.
(b) Find the probability that the number of errors per block is greater than or equal to 4.

Ans. (a) $E(X) = 0.16$, $\mathrm{Var}(X) = 0.158$
(b) $0.165 \times 10^{-4}$

2.69. Let the continuous r.v. X denote the weight (in pounds) of a package. The range of weight of packages is between 45 and 60 pounds.

(a) Determine the probability that a package weighs more than 50 pounds.
(b) Find the mean and the variance of the weight of packages.

Hint: Assume that X is uniformly distributed over (45, 60).

Ans. (a) 2/3; (b) $E(X) = 52.5$, $\mathrm{Var}(X) = 18.75$

2.70. In the manufacturing of computer memory chips, company A produces one defective chip for every nine good chips. Let X be the time to failure (in months) of chips. It is known that X is an exponential r.v. with parameter $\lambda = \tfrac{1}{2}$ for a defective chip and $\lambda = \tfrac{1}{10}$ for a good chip. Find the probability that a chip purchased randomly will fail before (a) six months of use; and (b) one year of use.

Ans. (a) 0.501; (b) 0.729

2.71. The median of a continuous r.v. X is the value of $x = x_0$ such that $P(X \geq x_0) = P(X \leq x_0)$. The mode of X is the value of $x = x_m$ at which the pdf of X achieves its maximum value.

(a) Find the median and mode of an exponential r.v. X with parameter $\lambda$.
(b) Find the median and mode of a normal r.v. $X = N(\mu, \sigma^2)$.

Ans. (a) $x_0 = (\ln 2)/\lambda = 0.693/\lambda$, $x_m = 0$
(b) $x_0 = x_m = \mu$
2.72. Let the r.v. X denote the number of defective components in a random sample of n components, chosen without replacement from a total of N components, r of which are defective. The r.v. X is known as the hypergeometric r.v. with parameters (N, r, n).

(a) Find the pmf of X.
(b) Find the mean and variance of X.

Hint: To find E(X), note that

$$\binom{r}{x} = \frac{r}{x}\binom{r-1}{x-1} \qquad \text{and} \qquad \binom{N}{n} = \sum_{x=0}^{n}\binom{r}{x}\binom{N-r}{n-x}$$

To find Var(X), first find $E[X(X - 1)]$.

Ans. (a) $p_X(x) = \dfrac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$, $x = 0, 1, 2, \ldots, \min\{r, n\}$

(b) $E(X) = n\left(\dfrac{r}{N}\right)$, $\mathrm{Var}(X) = n\left(\dfrac{r}{N}\right)\left(1 - \dfrac{r}{N}\right)\left(\dfrac{N-n}{N-1}\right)$

2.73. A lot consisting of 100 fuses is inspected by the following procedure: Five fuses are selected randomly, and if all five "blow" at the specified amperage, the lot is accepted. Suppose that the lot contains 10 defective fuses. Find the probability of accepting the lot.

Hint: Let X be a r.v. equal to the number of defective fuses in the sample of 5 and use the result of Prob. 2.72.

Ans. 0.584

2.74. Consider the experiment of observing a sequence of Bernoulli trials until exactly r successes occur. Let the r.v. X denote the number of trials needed to observe the rth success. The r.v. X is known as the negative binomial r.v. with parameter p, where p is the probability of a success at each trial.

(a) Find the pmf of X.
(b) Find the mean and variance of X.

Hint: To find E(X), use Maclaurin's series expansions of the negative binomial $h(q) = (1 - q)^{-r}$ and its derivatives $h'(q)$ and $h''(q)$, and note that

$$h(q) = (1 - q)^{-r} = \sum_{k=0}^{\infty}\binom{r + k - 1}{r - 1}q^k = \sum_{x=r}^{\infty}\binom{x - 1}{r - 1}q^{x-r}$$

To find Var(X), first find $E[(X - r)(X - r - 1)]$ using $h''(q)$.

Ans. (a) $p_X(x) = \dbinom{x-1}{r-1}p^r(1-p)^{x-r}$, $x = r, r + 1, \ldots$

(b) $E(X) = r\left(\dfrac{1}{p}\right)$, $\mathrm{Var}(X) = \dfrac{r(1-p)}{p^2}$

2.75. Suppose the probability that a bit transmitted through a digital communication channel and received in error is 0.1. Assuming that the transmissions are independent events, find the probability that the third error occurs at the 10th bit.

Ans. 0.017

2.76. A r.v. X is called a Laplace r.v. if its pdf is given by

$$f_X(x) = ke^{-\lambda|x|} \qquad \lambda > 0,\ -\infty < x < \infty$$

where k is a constant.

(a) Find the value of k.
(b) Find the cdf of X.
(c) Find the mean and the variance of X.

Ans. (a) $k = \lambda/2$
(b) $F_X(x) = \begin{cases} \tfrac{1}{2}e^{\lambda x} & x < 0 \\ 1 - \tfrac{1}{2}e^{-\lambda x} & x \geq 0 \end{cases}$
(c) $E(X) = 0$, $\mathrm{Var}(X) = 2/\lambda^2$

2.77. A r.v. X is called a Cauchy r.v. if its pdf is given by

$$f_X(x) = \frac{k}{a^2 + x^2} \qquad -\infty < x < \infty$$

where a (> 0) and k are constants.

(a) Find the value of k.
(b) Find the cdf of X.
(c) Find the mean and the variance of X.

Ans. (a) $k = a/\pi$
(b) $F_X(x) = \dfrac{1}{2} + \dfrac{1}{\pi}\tan^{-1}\left(\dfrac{x}{a}\right)$
(c) $E(X) = 0$ (as a principal value, by the symmetry of the pdf about 0), Var(X) does not exist.


Chapter 3 


Multiple Random Variables 


3.1 INTRODUCTION 


In many applications it is important to study two or more r.v.'s defined on the same sample space. In this chapter, we first consider the case of two r.v.'s, their associated distribution, and some properties, such as independence of the r.v.'s. These concepts are then extended to the case of many r.v.'s defined on the same sample space.


3.2. BIVARIATE RANDOM VARIABLES 


A. Definition: 


Let S be the sample space of a random experiment. Let X and Y be two r.v.'s. Then the pair (X, Y) is called a bivariate r.v. (or two-dimensional random vector) if each of X and Y associates a real number with every element of S. Thus, the bivariate r.v. (X, Y) can be considered as a function that to each point $\zeta$ in S assigns a point (x, y) in the plane (Fig. 3-1). The range space of the bivariate r.v. (X, Y) is denoted by $R_{XY}$ and defined by

$$R_{XY} = \{(x, y);\ \zeta \in S \text{ and } X(\zeta) = x,\ Y(\zeta) = y\}$$

If the r.v.'s X and Y are each, by themselves, discrete r.v.'s, then (X, Y) is called a discrete bivariate r.v. Similarly, if X and Y are each, by themselves, continuous r.v.'s, then (X, Y) is called a continuous bivariate r.v. If one of X and Y is discrete while the other is continuous, then (X, Y) is called a mixed bivariate r.v.

Fig. 3-1 (X, Y) as a function from S to the plane.

3.3 JOINT DISTRIBUTION FUNCTIONS
A. Definition: 


The joint cumulative distribution function (or joint cdf) of X and Y, denoted by $F_{XY}(x, y)$, is the function defined by

$$F_{XY}(x, y) = P(X \leq x,\ Y \leq y) \tag{3.1}$$

The event $(X \leq x, Y \leq y)$ in Eq. (3.1) is equivalent to the event $A \cap B$, where A and B are events of S defined by

$$A = \{\zeta \in S;\ X(\zeta) \leq x\} \qquad \text{and} \qquad B = \{\zeta \in S;\ Y(\zeta) \leq y\} \tag{3.2}$$

and

$$P(A) = F_X(x) \qquad P(B) = F_Y(y)$$

Thus,

$$F_{XY}(x, y) = P(A \cap B) \tag{3.3}$$

If, for particular values of x and y, A and B were independent events of S, then by Eq. (1.46),

$$F_{XY}(x, y) = P(A \cap B) = P(A)P(B) = F_X(x)F_Y(y)$$

B. Independent Random Variables:

Two r.v.'s X and Y will be called independent if

$$F_{XY}(x, y) = F_X(x)F_Y(y) \tag{3.4}$$

for every value of x and y.


C. Properties of $F_{XY}(x, y)$:

The joint cdf of two r.v.'s has many properties analogous to those of the cdf of a single r.v.

1. $0 \leq F_{XY}(x, y) \leq 1$ (3.5)
2. If $x_1 \leq x_2$, and $y_1 \leq y_2$, then
   $F_{XY}(x_1, y_1) \leq F_{XY}(x_2, y_1) \leq F_{XY}(x_2, y_2)$ (3.6a)
   $F_{XY}(x_1, y_1) \leq F_{XY}(x_1, y_2) \leq F_{XY}(x_2, y_2)$ (3.6b)
3. $\lim_{x \to \infty,\, y \to \infty} F_{XY}(x, y) = F_{XY}(\infty, \infty) = 1$ (3.7)
4. $\lim_{x \to -\infty} F_{XY}(x, y) = F_{XY}(-\infty, y) = 0$ (3.8a)
   $\lim_{y \to -\infty} F_{XY}(x, y) = F_{XY}(x, -\infty) = 0$ (3.8b)
5. $\lim_{x \to a^+} F_{XY}(x, y) = F_{XY}(a^+, y) = F_{XY}(a, y)$ (3.9a)
   $\lim_{y \to b^+} F_{XY}(x, y) = F_{XY}(x, b^+) = F_{XY}(x, b)$ (3.9b)
6. $P(x_1 < X \leq x_2,\ Y \leq y) = F_{XY}(x_2, y) - F_{XY}(x_1, y)$ (3.10)
   $P(X \leq x,\ y_1 < Y \leq y_2) = F_{XY}(x, y_2) - F_{XY}(x, y_1)$ (3.11)
7. If $x_1 \leq x_2$ and $y_1 \leq y_2$, then
   $F_{XY}(x_2, y_2) - F_{XY}(x_1, y_2) - F_{XY}(x_2, y_1) + F_{XY}(x_1, y_1) \geq 0$ (3.12)

Note that the left-hand side of Eq. (3.12) is equal to $P(x_1 < X \leq x_2,\ y_1 < Y \leq y_2)$ (Prob. 3.5).

D. Marginal Distribution Functions:

Now

$$\lim_{y \to \infty}(X \leq x,\ Y \leq y) = (X \leq x,\ Y \leq \infty) = (X \leq x)$$

since the condition $y \leq \infty$ is always satisfied. Then

$$\lim_{y \to \infty} F_{XY}(x, y) = F_{XY}(x, \infty) = F_X(x) \tag{3.13}$$

Similarly,

$$\lim_{x \to \infty} F_{XY}(x, y) = F_{XY}(\infty, y) = F_Y(y) \tag{3.14}$$

The cdf's $F_X(x)$ and $F_Y(y)$, when obtained by Eqs. (3.13) and (3.14), are referred to as the marginal cdf's of X and Y, respectively.


3.4 DISCRETE RANDOM VARIABLES—JOINT PROBABILITY MASS FUNCTIONS 
A. Joint Probability Mass Functions: 


Let (X, Y) be a discrete bivariate r.v., and let (X, Y) take on the values (x;, y,) for a certain 
allowable set of integers i and j. Let 


Pxylx;, y)) = P(X =x;, Y = y;) (3.15) 
The function py,(x;, y;) is called the joint probability mass function (joint pmf) of (X, Y). 


B. Properties of $p_{XY}(x_i, y_j)$:

1.  $0 \le p_{XY}(x_i, y_j) \le 1$    (3.16)
2.  $\sum_{x_i} \sum_{y_j} p_{XY}(x_i, y_j) = 1$    (3.17)
3.  $P[(X, Y) \in A] = \sum\sum_{(x_i, y_j) \in R_A} p_{XY}(x_i, y_j)$    (3.18)

where the summation is over the points $(x_i, y_j)$ in the range space $R_A$ corresponding to the event A.
The joint cdf of a discrete bivariate r.v. (X, Y) is given by

    $F_{XY}(x, y) = \sum_{x_i \le x} \sum_{y_j \le y} p_{XY}(x_i, y_j)$    (3.19)


C. Marginal Probability Mass Functions:

Suppose that for a fixed value $X = x_i$, the r.v. Y can take on only the possible values $y_j$ (j = 1, 2,
..., n). Then

    $P(X = x_i) = p_X(x_i) = \sum_{y_j} p_{XY}(x_i, y_j)$    (3.20)

where the summation is taken over all possible pairs $(x_i, y_j)$ with $x_i$ fixed. Similarly,

    $P(Y = y_j) = p_Y(y_j) = \sum_{x_i} p_{XY}(x_i, y_j)$    (3.21)

where the summation is taken over all possible pairs $(x_i, y_j)$ with $y_j$ fixed. The pmf’s $p_X(x_i)$ and $p_Y(y_j)$,
when obtained by Eqs. (3.20) and (3.21), are referred to as the marginal pmf’s of X and Y, respectively.


D. Independent Random Variables: 
If X and Y are independent r.v.’s, then (Prob. 3.10) 


    $p_{XY}(x_i, y_j) = p_X(x_i) p_Y(y_j)$    (3.22)
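
As a rough illustration of Eqs. (3.20) to (3.22), the sketch below builds a joint pmf as a Python dictionary, sums it into marginal pmf’s, and tests whether the product rule (3.22) holds. The urn setting (two red, three white, four blue balls, three drawn) is the one used in Prob. 3.13; the function names `joint_pmf_urn`, `marginals`, and `is_independent` are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from math import comb

def joint_pmf_urn(red=2, white=3, blue=4, draws=3):
    """Joint pmf of (X, Y) = (# red, # white) when `draws` balls are taken without replacement."""
    total = comb(red + white + blue, draws)
    pmf = {}
    for i in range(min(red, draws) + 1):
        for j in range(min(white, draws - i) + 1):
            k = draws - i - j  # number of blue balls drawn
            if k <= blue:
                pmf[(i, j)] = Fraction(comb(red, i) * comb(white, j) * comb(blue, k), total)
    return pmf

def marginals(pmf):
    """Eqs. (3.20) and (3.21): sum the joint pmf over the other variable."""
    p_x, p_y = {}, {}
    for (x, y), p in pmf.items():
        p_x[x] = p_x.get(x, 0) + p
        p_y[y] = p_y.get(y, 0) + p
    return p_x, p_y

def is_independent(pmf):
    """Eq. (3.22): the joint pmf must equal the product of marginals at every point."""
    p_x, p_y = marginals(pmf)
    return all(pmf.get((x, y), 0) == p_x[x] * p_y[y] for x in p_x for y in p_y)

pmf = joint_pmf_urn()
p_x, p_y = marginals(pmf)
print(p_x, p_y, is_independent(pmf))
```

Exact fractions are used so that the equality test in `is_independent` is not disturbed by floating-point rounding.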
3.5 CONTINUOUS RANDOM VARIABLES—JOINT PROBABILITY DENSITY
FUNCTIONS


A. Joint Probability Density Functions: 
Let (X, Y) be a continuous bivariate r.v. with cdf $F_{XY}(x, y)$ and let

    $f_{XY}(x, y) = \frac{\partial^2 F_{XY}(x, y)}{\partial x\, \partial y}$    (3.23)

The function $f_{XY}(x, y)$ is called the joint probability density function (joint pdf) of (X, Y). By
integrating Eq. (3.23), we have

    $F_{XY}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{XY}(\xi, \eta)\, d\eta\, d\xi$    (3.24)

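B. Properties of $f_{XY}(x, y)$:
1.  $f_{XY}(x, y) \ge 0$    (3.25)
2.  $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx\, dy = 1$    (3.26)
3.  $f_{XY}(x, y)$ is continuous for all values of x or y except possibly a finite set.
4.  $P[(X, Y) \in A] = \iint_{R_A} f_{XY}(x, y)\, dx\, dy$    (3.27)
5.  $P(a < X \le b, c < Y \le d) = \int_c^d \int_a^b f_{XY}(x, y)\, dx\, dy$    (3.28)

Since P(X = a) = 0 = P(Y = c) [by Eq. (2.19)], it follows that

    $P(a < X \le b, c < Y \le d) = P(a \le X \le b, c \le Y \le d) = P(a \le X < b, c \le Y < d)$
        $= P(a < X < b, c < Y < d) = \int_c^d \int_a^b f_{XY}(x, y)\, dx\, dy$    (3.29)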


C. Marginal Probability Density Functions: 
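By Eq. (3.13),

    $F_X(x) = F_{XY}(x, \infty) = \int_{-\infty}^{x} \int_{-\infty}^{\infty} f_{XY}(\xi, \eta)\, d\eta\, d\xi$

Hence    $f_X(x) = \frac{dF_X(x)}{dx} = \int_{-\infty}^{\infty} f_{XY}(x, \eta)\, d\eta$

or    $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dy$    (3.30)
Similarly,    $f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx$    (3.31)

The pdf’s $f_X(x)$ and $f_Y(y)$, when obtained by Eqs. (3.30) and (3.31), are referred to as the marginal pdf’s
of X and Y, respectively.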
D. Independent Random Variables:
If X and Y are independent r.v.’s, by Eq. (3.4),

    $F_{XY}(x, y) = F_X(x) F_Y(y)$

Then    $\frac{\partial^2 F_{XY}(x, y)}{\partial x\, \partial y} = \frac{\partial}{\partial x} F_X(x)\, \frac{\partial}{\partial y} F_Y(y)$

or    $f_{XY}(x, y) = f_X(x) f_Y(y)$    (3.32)

analogous with Eq. (3.22) for the discrete case. Thus, we say that the continuous r.v.’s X and Y are
independent r.v.’s if and only if Eq. (3.32) is satisfied.


3.6 CONDITIONAL DISTRIBUTIONS 
A. Conditional Probability Mass Functions: 
If (X, Y) is a discrete bivariate r.v. with joint pmf $p_{XY}(x_i, y_j)$, then the conditional pmf of Y, given
that $X = x_i$, is defined by

    $p_{Y|X}(y_j | x_i) = \frac{p_{XY}(x_i, y_j)}{p_X(x_i)}$    $p_X(x_i) > 0$    (3.33)

Similarly, we can define $p_{X|Y}(x_i | y_j)$ as

    $p_{X|Y}(x_i | y_j) = \frac{p_{XY}(x_i, y_j)}{p_Y(y_j)}$    $p_Y(y_j) > 0$    (3.34)

B. Properties of $p_{Y|X}(y_j | x_i)$:
1.  $0 \le p_{Y|X}(y_j | x_i) \le 1$    (3.35)
2.  $\sum_{y_j} p_{Y|X}(y_j | x_i) = 1$    (3.36)

Notice that if X and Y are independent, then by Eq. (3.22),

    $p_{Y|X}(y_j | x_i) = p_Y(y_j)$    and    $p_{X|Y}(x_i | y_j) = p_X(x_i)$    (3.37)


C. Conditional Probability Density Functions: 


If (X, Y) is a continuous bivariate r.v. with joint pdf $f_{XY}(x, y)$, then the conditional pdf of Y, given
that X = x, is defined by

    $f_{Y|X}(y | x) = \frac{f_{XY}(x, y)}{f_X(x)}$    $f_X(x) > 0$    (3.38)

Similarly, we can define $f_{X|Y}(x | y)$ as

    $f_{X|Y}(x | y) = \frac{f_{XY}(x, y)}{f_Y(y)}$    $f_Y(y) > 0$    (3.39)

D. Properties of $f_{Y|X}(y | x)$:
1.  $f_{Y|X}(y | x) \ge 0$    (3.40)
2.  $\int_{-\infty}^{\infty} f_{Y|X}(y | x)\, dy = 1$    (3.41)

As in the discrete case, if X and Y are independent, then by Eq. (3.32),

    $f_{Y|X}(y | x) = f_Y(y)$    and    $f_{X|Y}(x | y) = f_X(x)$    (3.42)


3.7 COVARIANCE AND CORRELATION COEFFICIENT 
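The (k, n)th moment of a bivariate r.v. (X, Y) is defined by

    $m_{kn} = E(X^k Y^n) = \begin{cases} \sum_{y_j} \sum_{x_i} x_i^k y_j^n\, p_{XY}(x_i, y_j) & \text{(discrete case)} \\ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^k y^n f_{XY}(x, y)\, dx\, dy & \text{(continuous case)} \end{cases}$    (3.43)

If n = 0, we obtain the kth moment of X, and if k = 0, we obtain the nth moment of Y. Thus,

    $m_{10} = E(X) = \mu_X$    and    $m_{01} = E(Y) = \mu_Y$    (3.44)

If (X, Y) is a discrete bivariate r.v., then using Eqs. (3.43), (3.20), and (3.21), we obtain

    $\mu_X = E(X) = \sum_{y_j} \sum_{x_i} x_i\, p_{XY}(x_i, y_j) = \sum_{x_i} x_i \left[ \sum_{y_j} p_{XY}(x_i, y_j) \right] = \sum_{x_i} x_i\, p_X(x_i)$    (3.45a)

    $\mu_Y = E(Y) = \sum_{x_i} \sum_{y_j} y_j\, p_{XY}(x_i, y_j) = \sum_{y_j} y_j \left[ \sum_{x_i} p_{XY}(x_i, y_j) \right] = \sum_{y_j} y_j\, p_Y(y_j)$    (3.45b)

Similarly, we have

    $E(X^2) = \sum_{y_j} \sum_{x_i} x_i^2\, p_{XY}(x_i, y_j) = \sum_{x_i} x_i^2\, p_X(x_i)$    (3.46a)

    $E(Y^2) = \sum_{y_j} \sum_{x_i} y_j^2\, p_{XY}(x_i, y_j) = \sum_{y_j} y_j^2\, p_Y(y_j)$    (3.46b)

If (X, Y) is a continuous bivariate r.v., then using Eqs. (3.43), (3.30), and (3.31), we obtain

    $\mu_X = E(X) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x f_{XY}(x, y)\, dx\, dy = \int_{-\infty}^{\infty} x \left[ \int_{-\infty}^{\infty} f_{XY}(x, y)\, dy \right] dx = \int_{-\infty}^{\infty} x f_X(x)\, dx$    (3.47a)

    $\mu_Y = E(Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y f_{XY}(x, y)\, dx\, dy = \int_{-\infty}^{\infty} y \left[ \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx \right] dy = \int_{-\infty}^{\infty} y f_Y(y)\, dy$    (3.47b)

Similarly, we have

    $E(X^2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^2 f_{XY}(x, y)\, dx\, dy = \int_{-\infty}^{\infty} x^2 f_X(x)\, dx$    (3.48a)

    $E(Y^2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y^2 f_{XY}(x, y)\, dx\, dy = \int_{-\infty}^{\infty} y^2 f_Y(y)\, dy$    (3.48b)
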
The variances of X and Y are easily obtained by using Eq. (2.37). The (1, 1)th joint moment of (X, Y),

    $m_{11} = E(XY)$    (3.49)

is called the correlation of X and Y. If E(XY) = 0, then we say that X and Y are orthogonal. The
covariance of X and Y, denoted by Cov(X, Y) or $\sigma_{XY}$, is defined by

    $\mathrm{Cov}(X, Y) = \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)]$    (3.50)

Expanding Eq. (3.50), we obtain

    $\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)$    (3.51)

If Cov(X, Y) = 0, then we say that X and Y are uncorrelated. From Eq. (3.51), we see that X and Y
are uncorrelated if

    $E(XY) = E(X)E(Y)$    (3.52)

Note that if X and Y are independent, then it can be shown that they are uncorrelated (Prob.
3.32), but the converse is not true in general; that is, the fact that X and Y are uncorrelated does not,
in general, imply that they are independent (Probs. 3.33, 3.34, and 3.38). The correlation coefficient,
denoted by $\rho(X, Y)$ or $\rho_{XY}$, is defined by

    $\rho(X, Y) = \rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$    (3.53)

It can be shown that (Prob. 3.36)

    $|\rho_{XY}| \le 1$    or    $-1 \le \rho_{XY} \le 1$    (3.54)

Note that the correlation coefficient of X and Y is a measure of linear dependence between X and Y
(see Prob. 4.40).
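
The statement that uncorrelated r.v.’s need not be independent can be checked on a small case. The sketch below evaluates Eqs. (3.51) and (3.53) for a joint pmf stored as a dictionary. The example pmf (X uniform on {-1, 0, 1} with Y = X²) is an assumption introduced here for illustration and is not taken from the text; the helper names are likewise hypothetical.

```python
from fractions import Fraction
from math import sqrt

def expect(pmf, g):
    """E[g(X, Y)] for a joint pmf given as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

def covariance(pmf):
    """Eq. (3.51): Cov(X, Y) = E(XY) - E(X)E(Y)."""
    return (expect(pmf, lambda x, y: x * y)
            - expect(pmf, lambda x, y: x) * expect(pmf, lambda x, y: y))

def correlation(pmf):
    """Eq. (3.53): rho = Cov(X, Y) / (sigma_X sigma_Y)."""
    var_x = expect(pmf, lambda x, y: x * x) - expect(pmf, lambda x, y: x) ** 2
    var_y = expect(pmf, lambda x, y: y * y) - expect(pmf, lambda x, y: y) ** 2
    return float(covariance(pmf)) / sqrt(float(var_x * var_y))

# Assumed example: X is uniform on {-1, 0, 1} and Y = X**2.
third = Fraction(1, 3)
pmf = {(-1, 1): third, (0, 0): third, (1, 1): third}

cov = covariance(pmf)  # zero, so X and Y are uncorrelated
p_x0 = sum(p for (x, _), p in pmf.items() if x == 0)
p_y0 = sum(p for (_, y), p in pmf.items() if y == 0)
factorizes_at_origin = pmf[(0, 0)] == p_x0 * p_y0  # False, so X and Y are dependent
print(cov, correlation(pmf), factorizes_at_origin)
```

Here Y is completely determined by X, yet the covariance vanishes because the dependence is not linear.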


3.8 CONDITIONAL MEANS AND CONDITIONAL VARIANCES 


If (X, Y) is a discrete bivariate r.v. with joint pmf $p_{XY}(x_i, y_j)$, then the conditional mean (or
conditional expectation) of Y, given that $X = x_i$, is defined by

    $\mu_{Y|x_i} = E(Y | x_i) = \sum_{y_j} y_j\, p_{Y|X}(y_j | x_i)$    (3.55)

The conditional variance of Y, given that $X = x_i$, is defined by

    $\sigma^2_{Y|x_i} = \mathrm{Var}(Y | x_i) = E[(Y - \mu_{Y|x_i})^2 | x_i] = \sum_{y_j} (y_j - \mu_{Y|x_i})^2\, p_{Y|X}(y_j | x_i)$    (3.56)

which can be reduced to

    $\mathrm{Var}(Y | x_i) = E(Y^2 | x_i) - [E(Y | x_i)]^2$    (3.57)

The conditional mean of X, given that $Y = y_j$, and the conditional variance of X, given that $Y = y_j$,
are given by similar expressions. Note that the conditional mean of Y, given that $X = x_i$, is a function
of $x_i$ alone. Similarly, the conditional mean of X, given that $Y = y_j$, is a function of $y_j$ alone.

If (X, Y) is a continuous bivariate r.v. with joint pdf $f_{XY}(x, y)$, the conditional mean of Y, given
that X = x, is defined by

    $\mu_{Y|x} = E(Y | x) = \int_{-\infty}^{\infty} y f_{Y|X}(y | x)\, dy$    (3.58)

The conditional variance of Y, given that X = x, is defined by

    $\sigma^2_{Y|x} = \mathrm{Var}(Y | x) = E[(Y - \mu_{Y|x})^2 | x] = \int_{-\infty}^{\infty} (y - \mu_{Y|x})^2 f_{Y|X}(y | x)\, dy$    (3.59)

which can be reduced to

    $\mathrm{Var}(Y | x) = E(Y^2 | x) - [E(Y | x)]^2$    (3.60)

The conditional mean of X, given that Y = y, and the conditional variance of X, given that Y = y,
are given by similar expressions. Note that the conditional mean of Y, given that X = x, is a function
of x alone. Similarly, the conditional mean of X, given that Y = y, is a function of y alone (Prob.
3.40).
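
A minimal sketch of Eqs. (3.33), (3.55), and (3.57) for a discrete case follows. It uses the joint pmf of Prob. 3.14, $p_{XY}(x_i, y_j) = (2x_i + y_j)/18$ for $x_i, y_j \in \{1, 2\}$; the function names are hypothetical and serve only this illustration.

```python
from fractions import Fraction

# Joint pmf of Prob. 3.14: p_XY(x, y) = (2x + y)/18 for x, y in {1, 2}.
joint = {(x, y): Fraction(2 * x + y, 18) for x in (1, 2) for y in (1, 2)}

def conditional_pmf_y_given_x(joint, x):
    """Eq. (3.33): p_{Y|X}(y | x) = p_XY(x, y) / p_X(x), defined only when p_X(x) > 0."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    if p_x == 0:
        raise ValueError("conditioning event has zero probability")
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}

def conditional_mean_and_variance(joint, x):
    """Eqs. (3.55) and (3.57): E(Y | x) and E(Y^2 | x) - [E(Y | x)]^2."""
    cond = conditional_pmf_y_given_x(joint, x)
    mean = sum(y * p for y, p in cond.items())
    second_moment = sum(y * y * p for y, p in cond.items())
    return mean, second_moment - mean ** 2

mean, var = conditional_mean_and_variance(joint, 2)
print(mean, var)
```

The conditional mean depends on the conditioning value only, which is why the function takes `x` as its sole extra argument.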


3.9 N-VARIATE RANDOM VARIABLES 


In previous sections, the extension from one r.v. to two r.v.’s has been made. The concepts can be 
extended easily to any number of r.v.’s defined on the same sample space. In this section we briefly 
describe some of the extensions. 


A. Definitions: 


Given an experiment, the n-tuple of r.v.’s $(X_1, X_2, \ldots, X_n)$ is called an n-variate r.v. (or
n-dimensional random vector) if each $X_i$, i = 1, 2, ..., n, associates a real number with every sample
point $\zeta \in S$. Thus, an n-variate r.v. is simply a rule associating an n-tuple of real numbers with every
$\zeta \in S$.

Let $(X_1, \ldots, X_n)$ be an n-variate r.v. on S. Then its joint cdf is defined as

    $F_{X_1 \cdots X_n}(x_1, \ldots, x_n) = P(X_1 \le x_1, \ldots, X_n \le x_n)$    (3.61)

Note that

    $F_{X_1 \cdots X_n}(\infty, \ldots, \infty) = 1$    (3.62)

The marginal joint cdf’s are obtained by setting the appropriate $x_i$’s to $+\infty$ in Eq. (3.61). For
example,

    $F_{X_1 \cdots X_{n-1}}(x_1, \ldots, x_{n-1}) = F_{X_1 \cdots X_{n-1} X_n}(x_1, \ldots, x_{n-1}, \infty)$    (3.63)
    $F_{X_1 X_2}(x_1, x_2) = F_{X_1 X_2 X_3 \cdots X_n}(x_1, x_2, \infty, \ldots, \infty)$    (3.64)

A discrete n-variate r.v. will be described by a joint pmf defined by

    $p_{X_1 \cdots X_n}(x_1, \ldots, x_n) = P(X_1 = x_1, \ldots, X_n = x_n)$    (3.65)

The probability of any n-dimensional event A is found by summing Eq. (3.65) over the points in the
n-dimensional range space $R_A$ corresponding to the event A:

    $P[(X_1, \ldots, X_n) \in A] = \sum \cdots \sum_{(x_1, \ldots, x_n) \in R_A} p_{X_1 \cdots X_n}(x_1, \ldots, x_n)$    (3.66)

Properties of $p_{X_1 \cdots X_n}(x_1, \ldots, x_n)$:
1.  $0 \le p_{X_1 \cdots X_n}(x_1, \ldots, x_n) \le 1$    (3.67)
2.  $\sum_{x_1} \cdots \sum_{x_n} p_{X_1 \cdots X_n}(x_1, \ldots, x_n) = 1$    (3.68)

The marginal pmf’s of one or more of the r.v.’s are obtained by summing Eq. (3.65) appropriately.
For example,

    $p_{X_1 \cdots X_{n-1}}(x_1, \ldots, x_{n-1}) = \sum_{x_n} p_{X_1 \cdots X_n}(x_1, \ldots, x_n)$    (3.69)

    $p_{X_1}(x_1) = \sum_{x_2} \cdots \sum_{x_n} p_{X_1 \cdots X_n}(x_1, \ldots, x_n)$    (3.70)
Conditional pmf’s are defined similarly. For example,

    $p_{X_n | X_1 \cdots X_{n-1}}(x_n | x_1, \ldots, x_{n-1}) = \frac{p_{X_1 \cdots X_n}(x_1, \ldots, x_n)}{p_{X_1 \cdots X_{n-1}}(x_1, \ldots, x_{n-1})}$    (3.71)

A continuous n-variate r.v. will be described by a joint pdf defined by

    $f_{X_1 \cdots X_n}(x_1, \ldots, x_n) = \frac{\partial^n F_{X_1 \cdots X_n}(x_1, \ldots, x_n)}{\partial x_1 \cdots \partial x_n}$    (3.72)

Then    $F_{X_1 \cdots X_n}(x_1, \ldots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} f_{X_1 \cdots X_n}(\xi_1, \ldots, \xi_n)\, d\xi_n \cdots d\xi_1$    (3.73)

and    $P[(X_1, \ldots, X_n) \in A] = \int \cdots \int_{(\xi_1, \ldots, \xi_n) \in R_A} f_{X_1 \cdots X_n}(\xi_1, \ldots, \xi_n)\, d\xi_1 \cdots d\xi_n$    (3.74)

Properties of $f_{X_1 \cdots X_n}(x_1, \ldots, x_n)$:
1.  $f_{X_1 \cdots X_n}(x_1, \ldots, x_n) \ge 0$    (3.75)
2.  $\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_1 \cdots X_n}(x_1, \ldots, x_n)\, dx_1 \cdots dx_n = 1$    (3.76)


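The marginal pdf’s of one or more of the r.v.’s are obtained by integrating Eq. (3.72) appropriately.
For example,

    $f_{X_1 \cdots X_{n-1}}(x_1, \ldots, x_{n-1}) = \int_{-\infty}^{\infty} f_{X_1 \cdots X_n}(x_1, \ldots, x_n)\, dx_n$    (3.77)

    $f_{X_1}(x_1) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_1 \cdots X_n}(x_1, \ldots, x_n)\, dx_2 \cdots dx_n$    (3.78)

Conditional pdf’s are defined similarly. For example,

    $f_{X_n | X_1 \cdots X_{n-1}}(x_n | x_1, \ldots, x_{n-1}) = \frac{f_{X_1 \cdots X_n}(x_1, \ldots, x_n)}{f_{X_1 \cdots X_{n-1}}(x_1, \ldots, x_{n-1})}$    (3.79)

The r.v.’s $X_1, \ldots, X_n$ are said to be mutually independent if

    $p_{X_1 \cdots X_n}(x_1, \ldots, x_n) = \prod_{i=1}^{n} p_{X_i}(x_i)$    (3.80)

for the discrete case, and

    $f_{X_1 \cdots X_n}(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_{X_i}(x_i)$    (3.81)

for the continuous case.
The mean (or expectation) of $X_i$ in $(X_1, \ldots, X_n)$ is defined as

    $\mu_i = E(X_i) = \begin{cases} \sum_{x_1} \cdots \sum_{x_n} x_i\, p_{X_1 \cdots X_n}(x_1, \ldots, x_n) & \text{(discrete case)} \\ \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_i f_{X_1 \cdots X_n}(x_1, \ldots, x_n)\, dx_1 \cdots dx_n & \text{(continuous case)} \end{cases}$    (3.82)

The variance of $X_i$ is defined as

    $\sigma_i^2 = \mathrm{Var}(X_i) = E[(X_i - \mu_i)^2]$    (3.83)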
The covariance of $X_i$ and $X_j$ is defined as

    $\sigma_{ij} = \mathrm{Cov}(X_i, X_j) = E[(X_i - \mu_i)(X_j - \mu_j)]$    (3.84)

The correlation coefficient of $X_i$ and $X_j$ is defined as

    $\rho_{ij} = \frac{\mathrm{Cov}(X_i, X_j)}{\sigma_i \sigma_j} = \frac{\sigma_{ij}}{\sigma_i \sigma_j}$    (3.85)


3.10 SPECIAL DISTRIBUTIONS 
A. Multinomial Distribution:

The multinomial distribution is an extension of the binomial distribution. An experiment is
termed a multinomial trial with parameters $p_1, p_2, \ldots, p_k$ if it has the following conditions:
1.  The experiment has k possible outcomes that are mutually exclusive and exhaustive, say $A_1, A_2, \ldots, A_k$.
2.  $P(A_i) = p_i$,  i = 1, ..., k,  and  $\sum_{i=1}^{k} p_i = 1$    (3.86)

Consider an experiment which consists of n repeated, independent, multinomial trials with parameters
$p_1, p_2, \ldots, p_k$. Let $X_i$ be the r.v. denoting the number of trials which result in $A_i$. Then
$(X_1, X_2, \ldots, X_k)$ is called the multinomial r.v. with parameters $(n, p_1, p_2, \ldots, p_k)$ and its pmf is given by (Prob.
3.46)

    $p_{X_1 X_2 \cdots X_k}(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$    (3.87)

for $x_i = 0, 1, \ldots, n$, i = 1, ..., k, such that $\sum_{i=1}^{k} x_i = n$.

Note that when k = 2, the multinomial distribution reduces to the binomial distribution.
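
The pmf (3.87) and its reduction to the binomial pmf for k = 2 can be sketched as follows. The function name `multinomial_pmf` and the numerical parameters are hypothetical choices made only for this illustration.

```python
from fractions import Fraction
from math import comb, factorial

def multinomial_pmf(counts, probs):
    """Eq. (3.87) for outcome counts (x_1, ..., x_k) and probabilities (p_1, ..., p_k); n is sum(counts)."""
    n = sum(counts)
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)  # each partial quotient is an integer
    value = Fraction(coeff)
    for x, p in zip(counts, probs):
        value *= Fraction(p) ** x
    return value

# k = 2: compare with the binomial pmf for n = 3 trials and p = 1/4.
p = Fraction(1, 4)
binomial_check = all(
    multinomial_pmf((x, 3 - x), (p, 1 - p)) == comb(3, x) * p ** x * (1 - p) ** (3 - x)
    for x in range(4)
)

# k = 6: a fair die rolled six times shows every face exactly once.
all_faces_once = multinomial_pmf((1,) * 6, (Fraction(1, 6),) * 6)
print(binomial_check, all_faces_once)
```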


B. Bivariate Normal Distribution: 
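A bivariate r.v. (X, Y) is said to be a bivariate normal (or gaussian) r.v. if its joint pdf is given by

    $f_{XY}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y (1 - \rho^2)^{1/2}} \exp\left[ -\frac{1}{2} q(x, y) \right]$    (3.88)

where    $q(x, y) = \frac{1}{1 - \rho^2} \left[ \left( \frac{x - \mu_X}{\sigma_X} \right)^2 - 2\rho \left( \frac{x - \mu_X}{\sigma_X} \right) \left( \frac{y - \mu_Y}{\sigma_Y} \right) + \left( \frac{y - \mu_Y}{\sigma_Y} \right)^2 \right]$    (3.89)

and $\mu_X$, $\mu_Y$, $\sigma_X^2$, $\sigma_Y^2$ are the means and variances of X and Y, respectively. It can be shown that $\rho$ is
the correlation coefficient of X and Y (Prob. 3.50) and that X and Y are independent when $\rho = 0$
(Prob. 3.49).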


C. N-variate Normal Distribution: 


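Let $(X_1, \ldots, X_n)$ be an n-variate r.v. defined on a sample space S. Let X be an n-dimensional
random vector expressed as an n x 1 matrix:

    $\mathbf{X} = \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix}$    (3.90)

Let x be an n-dimensional vector (n x 1 matrix) defined by

    $\mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$    (3.91)

The n-variate r.v. $(X_1, \ldots, X_n)$ is called an n-variate normal r.v. if its joint pdf is given by

    $f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{n/2} |\det K|^{1/2}} \exp\left[ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T K^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right]$    (3.92)

where T denotes the “transpose,” $\boldsymbol{\mu}$ is the vector mean, K is the covariance matrix given by

    $\boldsymbol{\mu} = E[\mathbf{X}] = \begin{bmatrix} \mu_1 \\ \vdots \\ \mu_n \end{bmatrix} = \begin{bmatrix} E(X_1) \\ \vdots \\ E(X_n) \end{bmatrix}$    (3.93)

    $K = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1n} \\ \vdots & \ddots & \vdots \\ \sigma_{n1} & \cdots & \sigma_{nn} \end{bmatrix}$    $\sigma_{ij} = \mathrm{Cov}(X_i, X_j)$    (3.94)

and det K is the determinant of the matrix K. Note that $f_{\mathbf{X}}(\mathbf{x})$ stands for $f_{X_1 \cdots X_n}(x_1, \ldots, x_n)$.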
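
One way to see how Eq. (3.92) relates to Eq. (3.88) is to evaluate both for n = 2 and compare. The sketch below does this with the standard library only; the function names and the numerical parameters are assumptions made for illustration, and the 2 x 2 inverse is written out by hand rather than computed by a matrix library.

```python
from math import exp, pi, sqrt

def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Eqs. (3.88) and (3.89)."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx * zx - 2 * rho * zx * zy + zy * zy) / (1 - rho * rho)
    return exp(-0.5 * q) / (2 * pi * sigma_x * sigma_y * sqrt(1 - rho * rho))

def normal_pdf_2d(x, mu, K):
    """Eq. (3.92) for n = 2, with K a 2 x 2 covariance matrix given as nested lists."""
    (a, b), (c, d) = K
    det = a * d - b * c
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    # K^{-1} = (1/det) [[d, -b], [-c, a]], so the quadratic form expands to:
    quad = (d * d0 * d0 - (b + c) * d0 * d1 + a * d1 * d1) / det
    return exp(-0.5 * quad) / (2 * pi * sqrt(abs(det)))

# Assumed parameters for the comparison.
mu_x, mu_y, sigma_x, sigma_y, rho = 1.0, -2.0, 1.5, 0.5, 0.3
K = [[sigma_x ** 2, rho * sigma_x * sigma_y],
     [rho * sigma_x * sigma_y, sigma_y ** 2]]

lhs = bivariate_normal_pdf(0.4, -1.7, mu_x, mu_y, sigma_x, sigma_y, rho)
rhs = normal_pdf_2d((0.4, -1.7), (mu_x, mu_y), K)
print(abs(lhs - rhs) < 1e-12)
```

With $\rho = 0$ the off-diagonal entries of K vanish and the density factors into the product of two one-dimensional normal pdf’s, in line with the independence remark for the bivariate case.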
Solved Problems 


BIVARIATE RANDOM VARIABLES AND JOINT DISTRIBUTION FUNCTIONS 


3.1. Consider an experiment of tossing a fair coin twice. Let (X, Y) be a bivariate r.v., where X is the
number of heads that occurs in the two tosses and Y is the number of tails that occurs in the two
tosses.

(a) What is the range $R_X$ of X?
(b) What is the range $R_Y$ of Y?
(c) Find and sketch the range $R_{XY}$ of (X, Y).
(d) Find P(X = 2, Y = 0), P(X = 0, Y = 2), and P(X = 1, Y = 1).

The sample space S of the experiment is

    $S = \{HH, HT, TH, TT\}$

(a) $R_X = \{0, 1, 2\}$
(b) $R_Y = \{0, 1, 2\}$
(c) $R_{XY} = \{(2, 0), (1, 1), (0, 2)\}$, which is sketched in Fig. 3-2.
(d) Since the coin is fair, we have

    $P(X = 2, Y = 0) = P\{HH\} = \frac{1}{4}$
    $P(X = 0, Y = 2) = P\{TT\} = \frac{1}{4}$
    $P(X = 1, Y = 1) = P\{HT, TH\} = \frac{1}{2}$


3.2. Consider a bivariate r.v. (X, Y). Find the region of the xy plane corresponding to the events

    $A = \{X + Y \le 2\}$    $B = \{X^2 + Y^2 < 4\}$
    $C = \{\min(X, Y) \le 2\}$    $D = \{\max(X, Y) \le 2\}$

Fig. 3-3

The region corresponding to event A is expressed by $x + y \le 2$, which is shown in Fig. 3-3(a), that is,
the region below and including the straight line x + y = 2.

The region corresponding to event B is expressed by $x^2 + y^2 < 2^2$, which is shown in Fig. 3-3(b), that
is, the region within the circle with its center at the origin and radius 2.

The region corresponding to event C is shown in Fig. 3-3(c), which is found by noting that

    $\{\min(X, Y) \le 2\} = (X \le 2) \cup (Y \le 2)$

The region corresponding to event D is shown in Fig. 3-3(d), which is found by noting that

    $\{\max(X, Y) \le 2\} = (X \le 2) \cap (Y \le 2)$

3.3. Verify Eqs. (3.7), (3.8a), and (3.8b).

Since $\{X \le \infty, Y \le \infty\} = S$ and by Eq. (1.22),

    $P(X \le \infty, Y \le \infty) = F_{XY}(\infty, \infty) = P(S) = 1$

Next, as we know, from Eq. (2.8),

    $P(X = -\infty) = P(Y = -\infty) = 0$

Since    $(X = -\infty, Y \le y) \subset (X = -\infty)$    and    $(X \le x, Y = -\infty) \subset (Y = -\infty)$

and by Eq. (1.27), we have

    $P(X = -\infty, Y \le y) = F_{XY}(-\infty, y) = 0$
    $P(X \le x, Y = -\infty) = F_{XY}(x, -\infty) = 0$

3.4. Verify Eqs. (3.10) and (3.11).

Clearly    $(X \le x_2, Y \le y) = (X \le x_1, Y \le y) \cup (x_1 < X \le x_2, Y \le y)$

The two events on the right-hand side are disjoint; hence by Eq. (1.23),

    $P(X \le x_2, Y \le y) = P(X \le x_1, Y \le y) + P(x_1 < X \le x_2, Y \le y)$
or    $P(x_1 < X \le x_2, Y \le y) = P(X \le x_2, Y \le y) - P(X \le x_1, Y \le y)$
        $= F_{XY}(x_2, y) - F_{XY}(x_1, y)$

Similarly,

    $(X \le x, Y \le y_2) = (X \le x, Y \le y_1) \cup (X \le x, y_1 < Y \le y_2)$

Again the two events on the right-hand side are disjoint; hence

    $P(X \le x, Y \le y_2) = P(X \le x, Y \le y_1) + P(X \le x, y_1 < Y \le y_2)$
or    $P(X \le x, y_1 < Y \le y_2) = P(X \le x, Y \le y_2) - P(X \le x, Y \le y_1)$
        $= F_{XY}(x, y_2) - F_{XY}(x, y_1)$

3.5. Verify Eq. (3.12).

Clearly

    $(x_1 < X \le x_2, Y \le y_2) = (x_1 < X \le x_2, Y \le y_1) \cup (x_1 < X \le x_2, y_1 < Y \le y_2)$

The two events on the right-hand side are disjoint; hence

    $P(x_1 < X \le x_2, Y \le y_2) = P(x_1 < X \le x_2, Y \le y_1) + P(x_1 < X \le x_2, y_1 < Y \le y_2)$

Then using Eq. (3.10), we obtain

    $P(x_1 < X \le x_2, y_1 < Y \le y_2) = P(x_1 < X \le x_2, Y \le y_2) - P(x_1 < X \le x_2, Y \le y_1)$
        $= F_{XY}(x_2, y_2) - F_{XY}(x_1, y_2) - [F_{XY}(x_2, y_1) - F_{XY}(x_1, y_1)]$
        $= F_{XY}(x_2, y_2) - F_{XY}(x_1, y_2) - F_{XY}(x_2, y_1) + F_{XY}(x_1, y_1)$    (3.95)
Since the probability must be nonnegative, we conclude that

    $F_{XY}(x_2, y_2) - F_{XY}(x_1, y_2) - F_{XY}(x_2, y_1) + F_{XY}(x_1, y_1) \ge 0$

if $x_2 \ge x_1$ and $y_2 \ge y_1$.

3.6. Consider a function

    $F(x, y) = \begin{cases} 1 - e^{-(x + y)} & 0 \le x < \infty,\ 0 \le y < \infty \\ 0 & \text{otherwise} \end{cases}$

Can this function be a joint cdf of a bivariate r.v. (X, Y)?

It is clear that F(x, y) satisfies properties 1 to 5 of a cdf [Eqs. (3.5) to (3.9)]. But substituting F(x, y) in
Eq. (3.12) and setting $x_2 = y_2 = 2$ and $x_1 = y_1 = 1$, we get

    $F(2, 2) - F(1, 2) - F(2, 1) + F(1, 1) = (1 - e^{-4}) - (1 - e^{-3}) - (1 - e^{-3}) + (1 - e^{-2})$
        $= -e^{-4} + 2e^{-3} - e^{-2} = -(e^{-2} - e^{-1})^2 < 0$

Thus, property 7 [Eq. (3.12)] is not satisfied. Hence F(x, y) cannot be a joint cdf.

3.7. Consider a bivariate r.v. (X, Y). Show that if X and Y are independent, then every event of the
form $(a < X \le b)$ is independent of every event of the form $(c < Y \le d)$.

By definition (3.4), if X and Y are independent, we have

    $F_{XY}(x, y) = F_X(x) F_Y(y)$

Setting $x_1 = a$, $x_2 = b$, $y_1 = c$, and $y_2 = d$ in Eq. (3.95) (Prob. 3.5), we obtain

    $P(a < X \le b, c < Y \le d) = F_{XY}(b, d) - F_{XY}(a, d) - F_{XY}(b, c) + F_{XY}(a, c)$
        $= F_X(b)F_Y(d) - F_X(a)F_Y(d) - F_X(b)F_Y(c) + F_X(a)F_Y(c)$
        $= [F_X(b) - F_X(a)][F_Y(d) - F_Y(c)]$
        $= P(a < X \le b) P(c < Y \le d)$

which indicates that event $(a < X \le b)$ and event $(c < Y \le d)$ are independent [Eq. (1.46)].

3.8. The joint cdf of a bivariate r.v. (X, Y) is given by

    $F_{XY}(x, y) = \begin{cases} (1 - e^{-\alpha x})(1 - e^{-\beta y}) & x \ge 0,\ y \ge 0,\ \alpha, \beta > 0 \\ 0 & \text{otherwise} \end{cases}$

(a) Find the marginal cdf’s of X and Y.
(b) Show that X and Y are independent.
(c) Find $P(X \le 1, Y \le 1)$, $P(X \le 1)$, $P(Y > 1)$, and $P(X > x, Y > y)$.

(a) By Eqs. (3.13) and (3.14), the marginal cdf’s of X and Y are

    $F_X(x) = F_{XY}(x, \infty) = \begin{cases} 1 - e^{-\alpha x} & x \ge 0 \\ 0 & x < 0 \end{cases}$

    $F_Y(y) = F_{XY}(\infty, y) = \begin{cases} 1 - e^{-\beta y} & y \ge 0 \\ 0 & y < 0 \end{cases}$

(b) Since $F_{XY}(x, y) = F_X(x) F_Y(y)$, X and Y are independent.

(c) $P(X \le 1, Y \le 1) = F_{XY}(1, 1) = (1 - e^{-\alpha})(1 - e^{-\beta})$
    $P(X \le 1) = F_X(1) = 1 - e^{-\alpha}$
    $P(Y > 1) = 1 - P(Y \le 1) = 1 - F_Y(1) = e^{-\beta}$

By De Morgan’s law (1.15), we have

    $\overline{(X > x) \cap (Y > y)} = \overline{(X > x)} \cup \overline{(Y > y)} = (X \le x) \cup (Y \le y)$

Then by Eq. (1.29),

    $P[\overline{(X > x) \cap (Y > y)}] = P(X \le x) + P(Y \le y) - P(X \le x, Y \le y)$
        $= F_X(x) + F_Y(y) - F_{XY}(x, y)$
        $= (1 - e^{-\alpha x}) + (1 - e^{-\beta y}) - (1 - e^{-\alpha x})(1 - e^{-\beta y})$
        $= 1 - e^{-\alpha x} e^{-\beta y}$

Finally, by Eq. (1.25), we obtain

    $P(X > x, Y > y) = 1 - P[\overline{(X > x) \cap (Y > y)}] = e^{-\alpha x} e^{-\beta y}$
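
The rectangle formula (3.95) lends itself to a quick numerical check of Probs. 3.6 and 3.8. In the sketch below, `rectangle_probability` is a hypothetical helper that evaluates the left-hand side of Eq. (3.12) for any candidate cdf; the values alpha = 1 and beta = 2 in the second function are assumptions chosen only to obtain numbers.

```python
from math import exp

def rectangle_probability(F, x1, x2, y1, y2):
    """Left-hand side of Eq. (3.12): F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)."""
    return F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)

def F_prob_3_6(x, y):
    """Candidate function of Prob. 3.6."""
    return 1 - exp(-(x + y)) if x >= 0 and y >= 0 else 0.0

def F_prob_3_8(x, y, alpha=1.0, beta=2.0):
    """Joint cdf of Prob. 3.8 with assumed parameter values."""
    return (1 - exp(-alpha * x)) * (1 - exp(-beta * y)) if x >= 0 and y >= 0 else 0.0

bad = rectangle_probability(F_prob_3_6, 1, 2, 1, 2)   # negative, so not a valid joint cdf
good = rectangle_probability(F_prob_3_8, 1, 2, 1, 2)  # a genuine probability
print(bad, good)
```

For the cdf of Prob. 3.8 the result equals the product $[F_X(2) - F_X(1)][F_Y(2) - F_Y(1)]$, as Prob. 3.7 predicts for independent r.v.’s.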


3.9. The joint cdf of a bivariate r.v. (X, Y) is given by

    $F_{XY}(x, y) = \begin{cases} 0 & x < 0 \text{ or } y < 0 \\ p_1 & 0 \le x < a,\ 0 \le y < b \\ p_2 & x \ge a,\ 0 \le y < b \\ p_3 & 0 \le x < a,\ y \ge b \\ 1 & x \ge a,\ y \ge b \end{cases}$

(a) Find the marginal cdf’s of X and Y.
(b) Find the conditions on $p_1$, $p_2$, and $p_3$ for which X and Y are independent.

(a) By Eq. (3.13), the marginal cdf of X is given by

    $F_X(x) = F_{XY}(x, \infty) = \begin{cases} 0 & x < 0 \\ p_3 & 0 \le x < a \\ 1 & x \ge a \end{cases}$

By Eq. (3.14), the marginal cdf of Y is given by

    $F_Y(y) = F_{XY}(\infty, y) = \begin{cases} 0 & y < 0 \\ p_2 & 0 \le y < b \\ 1 & y \ge b \end{cases}$

(b) For X and Y to be independent, by Eq. (3.4), we must have $F_{XY}(x, y) = F_X(x) F_Y(y)$. Thus, for $0 \le x < a$,
$0 \le y < b$, we must have $p_1 = p_2 p_3$ for X and Y to be independent.


DISCRETE BIVARIATE RANDOM VARIABLES—JOINT PROBABILITY MASS 
FUNCTIONS 


3.10. Verify Eq. (3.22).
If X and Y are independent, then by Eq. (1.46),

    $p_{XY}(x_i, y_j) = P(X = x_i, Y = y_j) = P(X = x_i) P(Y = y_j) = p_X(x_i) p_Y(y_j)$


3.11. Two fair dice are thrown. Consider a bivariate r.v. (X, Y). Let X = 0 or 1 according to whether
the first die shows an even number or an odd number of dots. Similarly, let Y = 0 or 1 according
to the second die.
(a) Find the range $R_{XY}$ of (X, Y).
(b) Find the joint pmf’s of (X, Y).

(a) The range of (X, Y) is

    $R_{XY} = \{(0, 0), (0, 1), (1, 0), (1, 1)\}$

(b) It is clear that X and Y are independent and

    $P(X = 0) = P(X = 1) = \frac{3}{6} = \frac{1}{2}$
    $P(Y = 0) = P(Y = 1) = \frac{3}{6} = \frac{1}{2}$

Thus

    $p_{XY}(i, j) = P(X = i, Y = j) = P(X = i) P(Y = j) = \frac{1}{4}$    i, j = 0, 1


3.12. Consider the binary communication channel shown in Fig. 3-4 (Prob. 1.52). Let (X, Y) be a
bivariate r.v., where X is the input to the channel and Y is the output of the channel. Let
P(X = 0) = 0.5, P(Y = 1 | X = 0) = 0.1, and P(Y = 0 | X = 1) = 0.2.
(a) Find the joint pmf’s of (X, Y).
(b) Find the marginal pmf’s of X and Y.
(c) Are X and Y independent?

Fig. 3-4 Binary communication channel.

(a) From the results of Prob. 1.52, we found that

    P(X = 1) = 1 - P(X = 0) = 0.5
    P(Y = 0 | X = 0) = 0.9    P(Y = 1 | X = 1) = 0.8

Then by Eq. (1.41), we obtain

    P(X = 0, Y = 0) = P(Y = 0 | X = 0)P(X = 0) = 0.9(0.5) = 0.45
    P(X = 0, Y = 1) = P(Y = 1 | X = 0)P(X = 0) = 0.1(0.5) = 0.05
    P(X = 1, Y = 0) = P(Y = 0 | X = 1)P(X = 1) = 0.2(0.5) = 0.1
    P(X = 1, Y = 1) = P(Y = 1 | X = 1)P(X = 1) = 0.8(0.5) = 0.4

Hence, the joint pmf’s of (X, Y) are

    $p_{XY}(0, 0) = 0.45$    $p_{XY}(0, 1) = 0.05$
    $p_{XY}(1, 0) = 0.1$    $p_{XY}(1, 1) = 0.4$

(b) By Eq. (3.20), the marginal pmf’s of X are

    $p_X(0) = \sum_{y_j} p_{XY}(0, y_j) = 0.45 + 0.05 = 0.5$
    $p_X(1) = \sum_{y_j} p_{XY}(1, y_j) = 0.1 + 0.4 = 0.5$

By Eq. (3.21), the marginal pmf’s of Y are

    $p_Y(0) = \sum_{x_i} p_{XY}(x_i, 0) = 0.45 + 0.1 = 0.55$
    $p_Y(1) = \sum_{x_i} p_{XY}(x_i, 1) = 0.05 + 0.4 = 0.45$

(c) Now

    $p_X(0) p_Y(0) = 0.5(0.55) = 0.275 \ne p_{XY}(0, 0) = 0.45$

Hence X and Y are not independent.
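
The computation in Prob. 3.12 can be reproduced in a few lines. The sketch below stores the input probabilities and the channel transition probabilities of the problem as exact fractions and derives the joint pmf, the output marginal, and the independence test. The `posterior` table, computed from Eq. (3.34), is an extra step that the problem does not ask for; all variable names are hypothetical.

```python
from fractions import Fraction

prior = {0: Fraction(1, 2), 1: Fraction(1, 2)}  # P(X = x)
transition = {                                   # transition[x][y] = P(Y = y | X = x)
    0: {0: Fraction(9, 10), 1: Fraction(1, 10)},
    1: {0: Fraction(2, 10), 1: Fraction(8, 10)},
}

# Joint pmf: P(X = x, Y = y) = P(Y = y | X = x) P(X = x).
joint = {(x, y): transition[x][y] * prior[x] for x in prior for y in (0, 1)}

# Marginal pmf of the output, Eq. (3.21).
p_y = {y: sum(joint[(x, y)] for x in prior) for y in (0, 1)}

# Conditional pmf of the input given the output, Eq. (3.34).
posterior = {(x, y): joint[(x, y)] / p_y[y] for (x, y) in joint}

# Product rule of Eq. (3.22).
independent = all(joint[(x, y)] == prior[x] * p_y[y] for (x, y) in joint)
print(joint, p_y, independent)
```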


3.13. Consider an experiment of drawing randomly three balls from an urn containing two red, three
white, and four blue balls. Let (X, Y) be a bivariate r.v. where X and Y denote, respectively, the
number of red and white balls chosen.

(a) Find the range of (X, Y).
(b) Find the joint pmf’s of (X, Y).
(c) Find the marginal pmf’s of X and Y.
(d) Are X and Y independent?

(a) The range of (X, Y) is given by

    $R_{XY} = \{(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1)\}$

(b) The joint pmf’s of (X, Y)

    $p_{XY}(i, j) = P(X = i, Y = j)$    i = 0, 1, 2    j = 0, 1, 2, 3

are given as follows:

    $p_{XY}(0, 0) = \binom{4}{3} \Big/ \binom{9}{3} = \frac{4}{84}$
    $p_{XY}(0, 1) = \binom{3}{1}\binom{4}{2} \Big/ \binom{9}{3} = \frac{18}{84}$
    $p_{XY}(0, 2) = \binom{3}{2}\binom{4}{1} \Big/ \binom{9}{3} = \frac{12}{84}$
    $p_{XY}(0, 3) = \binom{3}{3} \Big/ \binom{9}{3} = \frac{1}{84}$
    $p_{XY}(1, 0) = \binom{2}{1}\binom{4}{2} \Big/ \binom{9}{3} = \frac{12}{84}$
    $p_{XY}(1, 1) = \binom{2}{1}\binom{3}{1}\binom{4}{1} \Big/ \binom{9}{3} = \frac{24}{84}$
    $p_{XY}(1, 2) = \binom{2}{1}\binom{3}{2} \Big/ \binom{9}{3} = \frac{6}{84}$
    $p_{XY}(2, 0) = \binom{2}{2}\binom{4}{1} \Big/ \binom{9}{3} = \frac{4}{84}$
    $p_{XY}(2, 1) = \binom{2}{2}\binom{3}{1} \Big/ \binom{9}{3} = \frac{3}{84}$

which are expressed in tabular form as in Table 3.1.

Table 3.1 $p_{XY}(i, j)$

              j
    i       0        1        2        3
    0     4/84    18/84    12/84     1/84
    1    12/84    24/84     6/84      0
    2     4/84     3/84      0        0

(c) The marginal pmf’s of X are obtained from Table 3.1 by computing the row sums, and the marginal
pmf’s of Y are obtained by computing the column sums. Thus

    $p_X(0) = \frac{35}{84}$    $p_X(1) = \frac{42}{84}$    $p_X(2) = \frac{7}{84}$
    $p_Y(0) = \frac{20}{84}$    $p_Y(1) = \frac{45}{84}$    $p_Y(2) = \frac{18}{84}$    $p_Y(3) = \frac{1}{84}$

(d) Since

    $p_{XY}(0, 0) = \frac{4}{84} \ne p_X(0) p_Y(0) = \frac{35}{84} \left( \frac{20}{84} \right)$

X and Y are not independent.

3.14. The joint pmf of a bivariate r.v. (X, Y) is given by

    $p_{XY}(x_i, y_j) = \begin{cases} k(2x_i + y_j) & x_i = 1, 2;\ y_j = 1, 2 \\ 0 & \text{otherwise} \end{cases}$

where k is a constant.
(a) Find the value of k.
(b) Find the marginal pmf’s of X and Y.
(c) Are X and Y independent?

(a) By Eq. (3.17),

    $\sum_{x_i} \sum_{y_j} p_{XY}(x_i, y_j) = \sum_{x_i = 1}^{2} \sum_{y_j = 1}^{2} k(2x_i + y_j)$
        $= k[(2 + 1) + (2 + 2) + (4 + 1) + (4 + 2)] = k(18) = 1$

Thus, $k = \frac{1}{18}$.

(b) By Eq. (3.20), the marginal pmf’s of X are

    $p_X(x_i) = \sum_{y_j} p_{XY}(x_i, y_j) = \sum_{y_j = 1}^{2} \frac{1}{18}(2x_i + y_j)$
        $= \frac{1}{18}(2x_i + 1) + \frac{1}{18}(2x_i + 2) = \frac{1}{18}(4x_i + 3)$    $x_i = 1, 2$

By Eq. (3.21), the marginal pmf’s of Y are

    $p_Y(y_j) = \sum_{x_i} p_{XY}(x_i, y_j) = \sum_{x_i = 1}^{2} \frac{1}{18}(2x_i + y_j)$
        $= \frac{1}{18}(2 + y_j) + \frac{1}{18}(4 + y_j) = \frac{1}{18}(2y_j + 6)$    $y_j = 1, 2$

(c) Now $p_X(x_i) p_Y(y_j) \ne p_{XY}(x_i, y_j)$; hence X and Y are not independent.

3.15. The joint pmf of a bivariate r.v. (X, Y) is given by

    $p_{XY}(x_i, y_j) = \begin{cases} k x_i^2 y_j & x_i = 1, 2;\ y_j = 1, 2, 3 \\ 0 & \text{otherwise} \end{cases}$

where k is a constant.
(a) Find the value of k.
(b) Find the marginal pmf’s of X and Y.
(c) Are X and Y independent?

(a) By Eq. (3.17),

    $\sum_{x_i} \sum_{y_j} p_{XY}(x_i, y_j) = \sum_{x_i = 1}^{2} \sum_{y_j = 1}^{3} k x_i^2 y_j$
        $= k(1 + 2 + 3 + 4 + 8 + 12) = k(30) = 1$

Thus, $k = \frac{1}{30}$.

(b) By Eq. (3.20), the marginal pmf’s of X are

    $p_X(x_i) = \sum_{y_j} p_{XY}(x_i, y_j) = \sum_{y_j = 1}^{3} \frac{1}{30} x_i^2 y_j = \frac{1}{5} x_i^2$    $x_i = 1, 2$

By Eq. (3.21), the marginal pmf’s of Y are

    $p_Y(y_j) = \sum_{x_i} p_{XY}(x_i, y_j) = \sum_{x_i = 1}^{2} \frac{1}{30} x_i^2 y_j = \frac{1}{6} y_j$    $y_j = 1, 2, 3$

(c) Now

    $p_X(x_i) p_Y(y_j) = \frac{1}{30} x_i^2 y_j = p_{XY}(x_i, y_j)$

Hence X and Y are independent.


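3.16. Consider an experiment of tossing two coins three times. Coin A is fair, but coin B is not fair,
with $P(H) = \frac{1}{4}$ and $P(T) = \frac{3}{4}$. Consider a bivariate r.v. (X, Y), where X denotes the number of
heads resulting from coin A and Y denotes the number of heads resulting from coin B.
(a) Find the range of (X, Y).
(b) Find the joint pmf’s of (X, Y).
(c) Find P(X = Y), P(X > Y), and $P(X + Y \le 4)$.

(a) The range of (X, Y) is given by

    $R_{XY} = \{(i, j): i, j = 0, 1, 2, 3\}$

(b) It is clear that the r.v.’s X and Y are independent, and they are both binomial r.v.’s with parameters
$(n, p) = (3, \frac{1}{2})$ and $(n, p) = (3, \frac{1}{4})$, respectively. Thus, by Eq. (2.36), we have

    $p_X(0) = P(X = 0) = \binom{3}{0} \left( \frac{1}{2} \right)^3 = \frac{1}{8}$    $p_X(1) = P(X = 1) = \binom{3}{1} \left( \frac{1}{2} \right)^3 = \frac{3}{8}$
    $p_X(2) = P(X = 2) = \binom{3}{2} \left( \frac{1}{2} \right)^3 = \frac{3}{8}$    $p_X(3) = P(X = 3) = \binom{3}{3} \left( \frac{1}{2} \right)^3 = \frac{1}{8}$
    $p_Y(0) = P(Y = 0) = \binom{3}{0} \left( \frac{1}{4} \right)^0 \left( \frac{3}{4} \right)^3 = \frac{27}{64}$    $p_Y(1) = P(Y = 1) = \binom{3}{1} \left( \frac{1}{4} \right) \left( \frac{3}{4} \right)^2 = \frac{27}{64}$
    $p_Y(2) = P(Y = 2) = \binom{3}{2} \left( \frac{1}{4} \right)^2 \left( \frac{3}{4} \right) = \frac{9}{64}$    $p_Y(3) = P(Y = 3) = \binom{3}{3} \left( \frac{1}{4} \right)^3 \left( \frac{3}{4} \right)^0 = \frac{1}{64}$

Since X and Y are independent, the joint pmf’s of (X, Y) are

    $p_{XY}(i, j) = p_X(i) p_Y(j)$    i, j = 0, 1, 2, 3

which are tabulated in Table 3.2.

Table 3.2 $p_{XY}(i, j)$

              j
    i        0         1         2         3
    0     27/512    27/512     9/512     1/512
    1     81/512    81/512    27/512     3/512
    2     81/512    81/512    27/512     3/512
    3     27/512    27/512     9/512     1/512

(c) From Table 3.2, we have

    $P(X = Y) = \sum_{i=0}^{3} p_{XY}(i, i) = \frac{1}{512}(27 + 81 + 27 + 1) = \frac{136}{512}$

    $P(X > Y) = \sum_{i > j} p_{XY}(i, j) = p_{XY}(1, 0) + p_{XY}(2, 0) + p_{XY}(3, 0) + p_{XY}(2, 1) + p_{XY}(3, 1) + p_{XY}(3, 2)$
        $= \frac{1}{512}(81 + 81 + 27 + 81 + 27 + 9) = \frac{306}{512}$

    $P(X + Y > 4) = \sum_{i + j > 4} p_{XY}(i, j) = p_{XY}(2, 3) + p_{XY}(3, 2) + p_{XY}(3, 3)$
        $= \frac{1}{512}(3 + 9 + 1) = \frac{13}{512}$

Thus,    $P(X + Y \le 4) = 1 - P(X + Y > 4) = 1 - \frac{13}{512} = \frac{499}{512}$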
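
The three probabilities of part (c) can be recomputed by summing the joint pmf over the points of each event, as in Eq. (3.18). The sketch below assumes the parameters of Prob. 3.16 (coin A fair, coin B with P(H) = 1/4, three tosses each); `binomial_pmf` and `prob` are hypothetical helper names.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """List of P(K = k), k = 0, ..., n, for a binomial r.v. with parameters (n, p)."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

p_x = binomial_pmf(3, Fraction(1, 2))  # heads from coin A
p_y = binomial_pmf(3, Fraction(1, 4))  # heads from coin B

# X and Y are independent, so the joint pmf is the product of the marginals.
joint = {(i, j): p_x[i] * p_y[j] for i in range(4) for j in range(4)}

def prob(event):
    """Eq. (3.18): sum the joint pmf over the points (i, j) for which event(i, j) is true."""
    return sum(p for (i, j), p in joint.items() if event(i, j))

p_equal = prob(lambda i, j: i == j)
p_greater = prob(lambda i, j: i > j)
p_sum_at_most_4 = prob(lambda i, j: i + j <= 4)
print(p_equal, p_greater, p_sum_at_most_4)
```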


CONTINUOUS BIVARIATE RANDOM VARIABLES—PROBABILITY DENSITY 
FUNCTIONS 


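3.17. The joint pdf of a bivariate r.v. (X, Y) is given by

    $f_{XY}(x, y) = \begin{cases} k(x + y) & 0 < x < 2,\ 0 < y < 2 \\ 0 & \text{otherwise} \end{cases}$

where k is a constant.

(a) Find the value of k.
(b) Find the marginal pdf’s of X and Y.
(c) Are X and Y independent?

(a) By Eq. (3.26),

    $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx\, dy = k \int_0^2 \int_0^2 (x + y)\, dx\, dy$
        $= k \int_0^2 \left( \frac{x^2}{2} + xy \right) \Big|_{x=0}^{x=2} dy$
        $= k \int_0^2 (2 + 2y)\, dy = 8k = 1$

Thus $k = \frac{1}{8}$.

(b) By Eq. (3.30), the marginal pdf of X is

    $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dy = \frac{1}{8} \int_0^2 (x + y)\, dy = \frac{1}{8} \left( xy + \frac{y^2}{2} \right) \Big|_{y=0}^{y=2}$
        $= \begin{cases} \frac{1}{4}(x + 1) & 0 < x < 2 \\ 0 & \text{otherwise} \end{cases}$

Since $f_{XY}(x, y)$ is symmetric with respect to x and y, the marginal pdf of Y is

    $f_Y(y) = \begin{cases} \frac{1}{4}(y + 1) & 0 < y < 2 \\ 0 & \text{otherwise} \end{cases}$

(c) Since $f_{XY}(x, y) \ne f_X(x) f_Y(y)$, X and Y are not independent.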


3.18. The joint pdf of a bivariate r.v. (X, Y) is given by

    $f_{XY}(x, y) = \begin{cases} kxy & 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$

where k is a constant.
(a) Find the value of k.
(b) Are X and Y independent?
(c) Find P(X + Y < 1).

Fig. 3-5

(a) The range space $R_{XY}$ is shown in Fig. 3-5(a). By Eq. (3.26),

    $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx\, dy = k \int_0^1 \int_0^1 xy\, dx\, dy = k \int_0^1 y \left( \frac{x^2}{2} \Big|_0^1 \right) dy = k \int_0^1 \frac{y}{2}\, dy = \frac{k}{4} = 1$

Thus k = 4.

(b) To determine whether X and Y are independent, we must find the marginal pdf’s of X and Y. By Eq.
(3.30),

    $f_X(x) = \begin{cases} \int_0^1 4xy\, dy = 2x & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$

By symmetry,

    $f_Y(y) = \begin{cases} 2y & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$

Since $f_{XY}(x, y) = f_X(x) f_Y(y)$, X and Y are independent.

(c) The region in the xy plane corresponding to the event (X + Y < 1) is shown in Fig. 3-5(b) as a shaded
area. Then

    $P(X + Y < 1) = \int_0^1 \int_0^{1-y} 4xy\, dx\, dy = \int_0^1 4y \left( \frac{x^2}{2} \Big|_0^{1-y} \right) dy$
        $= \int_0^1 4y \left[ \frac{1}{2}(1 - y)^2 \right] dy = \frac{1}{6}$
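
A crude numerical check of parts (a) and (c) of Prob. 3.18 is sketched below, using a midpoint rule on the unit square. The function `double_midpoint` and the grid size are hypothetical choices; because the event boundary x + y = 1 cuts through grid cells, the second value is only approximate and the comparison with 1/6 must allow a small tolerance.

```python
def double_midpoint(f, region, n=200):
    """Midpoint-rule estimate of the integral of f over the part of the unit square where region(x, y) holds."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if region(x, y):
                total += f(x, y)
    return total * h * h

def f_xy(x, y):
    """Joint pdf of Prob. 3.18 with k = 4, restricted to the unit square."""
    return 4 * x * y

total_mass = double_midpoint(f_xy, lambda x, y: True)      # close to 1, Eq. (3.26)
p_event = double_midpoint(f_xy, lambda x, y: x + y < 1)    # close to 1/6
print(total_mass, p_event)
```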
3.19. The joint pdf of a bivariate r.v. (X, Y) is given by

    $f_{XY}(x, y) = \begin{cases} kxy & 0 < x < y < 1 \\ 0 & \text{otherwise} \end{cases}$

where k is a constant.

(a) Find the value of k.
(b) Are X and Y independent?

Fig. 3-6

(a) The range space $R_{XY}$ is shown in Fig. 3-6. By Eq. (3.26),

    $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx\, dy = k \int_0^1 \int_0^y xy\, dx\, dy = k \int_0^1 y \left( \frac{x^2}{2} \Big|_0^y \right) dy$
        $= k \int_0^1 \frac{y^3}{2}\, dy = \frac{k}{8} = 1$

Thus k = 8.

(b) By Eq. (3.30), the marginal pdf of X is

    $f_X(x) = \begin{cases} \int_x^1 8xy\, dy = 4x(1 - x^2) & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$

By Eq. (3.31), the marginal pdf of Y is

    $f_Y(y) = \begin{cases} \int_0^y 8xy\, dx = 4y^3 & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$

Since $f_{XY}(x, y) \ne f_X(x) f_Y(y)$, X and Y are not independent.
Note that if the range space $R_{XY}$ depends functionally on x or y, then X and Y cannot be independent
r.v.’s.


3.20. The joint pdf of a bivariate r.v. (X, Y) is given by

    $f_{XY}(x, y) = \begin{cases} k & 0 < y \le x < 1 \\ 0 & \text{otherwise} \end{cases}$

where k is a constant.

(a) Determine the value of k.
(b) Find the marginal pdf’s of X and Y.
(c) Find $P(0 < X < \frac{1}{2}, 0 < Y < \frac{1}{2})$.

Fig. 3-7

(a) The range space $R_{XY}$ is shown in Fig. 3-7. By Eq. (3.26),

    $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx\, dy = k \iint_{R_{XY}} dx\, dy = k \times \text{area}(R_{XY}) = k \left( \frac{1}{2} \right) = 1$

Thus k = 2.

(b) By Eq. (3.30), the marginal pdf of X is

    $f_X(x) = \begin{cases} \int_0^x 2\, dy = 2x & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$

By Eq. (3.31), the marginal pdf of Y is

    $f_Y(y) = \begin{cases} \int_y^1 2\, dx = 2(1 - y) & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$

(c) The region in the xy plane corresponding to the event $(0 < X < \frac{1}{2}, 0 < Y < \frac{1}{2})$ is shown in Fig. 3-7 as
the shaded area $R_s$. Then

    $P(0 < X < \frac{1}{2}, 0 < Y < \frac{1}{2}) = P(0 < X < \frac{1}{2}, 0 < Y < X)$
        $= \iint_{R_s} f_{XY}(x, y)\, dx\, dy = 2 \iint_{R_s} dx\, dy = 2 \times \text{area}(R_s) = 2 \left( \frac{1}{8} \right) = \frac{1}{4}$

Note that the bivariate r.v. (X, Y) is said to be uniformly distributed over the region $R_{XY}$ if its pdf is

    $f_{XY}(x, y) = \begin{cases} k & (x, y) \in R_{XY} \\ 0 & \text{otherwise} \end{cases}$    (3.96)

where k is a constant. Then by Eq. (3.26), the constant k must be k = 1/(area of $R_{XY}$).

3.21. Suppose we select one point at random from within the circle with radius R. If we let the center
of the circle denote the origin and define X and Y to be the coordinates of the point chosen (Fig.
3-8), then (X, Y) is a uniform bivariate r.v. with joint pdf given by

    $f_{XY}(x, y) = \begin{cases} k & x^2 + y^2 \le R^2 \\ 0 & x^2 + y^2 > R^2 \end{cases}$

where k is a constant.

(a) Determine the value of k.
(b) Find the marginal pdf’s of X and Y.
(c) Find the probability that the distance from the origin of the point selected is not greater
than a.

Fig. 3-8

(a) By Eq. (3.26),

    $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx\, dy = k \iint_{x^2 + y^2 \le R^2} dx\, dy = k(\pi R^2) = 1$

Thus $k = 1/(\pi R^2)$.

(b) By Eq. (3.30), the marginal pdf of X is

    $f_X(x) = \frac{1}{\pi R^2} \int_{-\sqrt{R^2 - x^2}}^{\sqrt{R^2 - x^2}} dy = \frac{2}{\pi R^2} \sqrt{R^2 - x^2}$    $x^2 \le R^2$

Hence    $f_X(x) = \begin{cases} \frac{2}{\pi R^2} \sqrt{R^2 - x^2} & |x| \le R \\ 0 & |x| > R \end{cases}$

By symmetry, the marginal pdf of Y is

    $f_Y(y) = \begin{cases} \frac{2}{\pi R^2} \sqrt{R^2 - y^2} & |y| \le R \\ 0 & |y| > R \end{cases}$

(c) For $0 \le a \le R$,

    $P(X^2 + Y^2 \le a^2) = \iint_{x^2 + y^2 \le a^2} f_{XY}(x, y)\, dx\, dy = \frac{1}{\pi R^2} \iint_{x^2 + y^2 \le a^2} dx\, dy = \frac{\pi a^2}{\pi R^2} = \frac{a^2}{R^2}$

3.22. The joint pdf of a bivariate r.v. (X, Y) is given by

    $f_{XY}(x, y) = \begin{cases} k e^{-(ax + by)} & x > 0,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$

where a and b are positive constants and k is a constant.

(a) Determine the value of k.
(b) Are X and Y independent?

(a) By Eq. (3.26),

    $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx\, dy = k \int_0^{\infty} \int_0^{\infty} e^{-(ax + by)}\, dx\, dy$
        $= k \int_0^{\infty} e^{-ax}\, dx \int_0^{\infty} e^{-by}\, dy = \frac{k}{ab} = 1$

Thus k = ab.

(b) By Eq. (3.30), the marginal pdf of X is

    $f_X(x) = ab\, e^{-ax} \int_0^{\infty} e^{-by}\, dy = a e^{-ax}$    x > 0

By Eq. (3.31), the marginal pdf of Y is

    $f_Y(y) = ab\, e^{-by} \int_0^{\infty} e^{-ax}\, dx = b e^{-by}$    y > 0

Since $f_{XY}(x, y) = f_X(x) f_Y(y)$, X and Y are independent.
3.23. A manufacturer has been using two different manufacturing processes to make computer
memory chips. Let (X, Y) be a bivariate r.v., where X denotes the time to failure of chips made
by process A and Y denotes the time to failure of chips made by process B. Assuming that the
joint pdf of (X, Y) is

    $f_{XY}(x, y) = \begin{cases} ab\, e^{-(ax + by)} & x > 0,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$

where $a = 10^{-4}$ and $b = 1.2(10^{-4})$, determine P(X > Y).

The region in the xy plane corresponding to the event (X > Y) is shown in Fig. 3-9 as the shaded area.
Then

    $P(X > Y) = ab \int_0^{\infty} \int_0^{x} e^{-(ax + by)}\, dy\, dx$
        $= ab \int_0^{\infty} e^{-ax} \left[ \int_0^{x} e^{-by}\, dy \right] dx = a \int_0^{\infty} e^{-ax} (1 - e^{-bx})\, dx$
        $= \frac{b}{a + b} = \frac{1.2(10^{-4})}{(1 + 1.2)(10^{-4})} = 0.545$

Fig. 3-9

3.24. A smooth-surface table is ruled with equidistant parallel lines a distance D apart. A needle of 
length L, where L < D, is randomly dropped onto this table. What is the probability that the 
needle will intersect one of the lines? (This is known as Buffon's needle problem.) 


We can determine the needle's position by specifying a bivariate r.v. (X, Θ), where X is the distance
from the middle point of the needle to the nearest parallel line and Θ is the angle from the vertical to the
needle (Fig. 3-10). We interpret the statement "the needle is randomly dropped" to mean that both X and Θ
have uniform distributions and that X and Θ are independent. The possible values of X are between 0 and
D/2, and the possible values of Θ are between 0 and π/2. Thus, the joint pdf of (X, Θ) is

$$f_{X\Theta}(x, \theta) = f_X(x) f_\Theta(\theta) = \begin{cases} \dfrac{4}{\pi D} & 0 \le x \le \dfrac{D}{2},\ 0 \le \theta \le \dfrac{\pi}{2} \\ 0 & \text{otherwise} \end{cases}$$

Fig. 3-10 Buffon's needle problem.

From Fig. 3-10, we see that the condition for the needle to intersect a line is X < (L/2) cos θ. Thus, the
probability that the needle will intersect a line is

$$P\left(X < \frac{L}{2}\cos\theta\right) = \int_0^{\pi/2}\int_0^{(L/2)\cos\theta} f_{X\Theta}(x, \theta)\,dx\,d\theta = \frac{4}{\pi D}\int_0^{\pi/2}\left[\int_0^{(L/2)\cos\theta} dx\right]d\theta$$

$$= \frac{4}{\pi D}\int_0^{\pi/2}\frac{L}{2}\cos\theta\,d\theta = \frac{2L}{\pi D}$$
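
The result 2L/(πD) lends itself to a simulation check. The following is a minimal Monte Carlo sketch under the same modeling assumptions as the solution (X uniform on (0, D/2), Θ uniform on (0, π/2), independent). The function name, the seed, and the sample size are choices made here for illustration only.

```python
import math
import random

def buffon_estimate(needle_len: float, spacing: float, trials: int, seed: int = 0) -> float:
    """Estimate P(needle crosses a line) by sampling (X, Theta) as in Prob. 3.24."""
    if not 0 < needle_len < spacing:
        raise ValueError("requires 0 < L < D")
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.uniform(0.0, spacing / 2.0)        # midpoint distance to nearest line
        theta = rng.uniform(0.0, math.pi / 2.0)    # angle from the vertical
        if x < (needle_len / 2.0) * math.cos(theta):
            hits += 1
    return hits / trials

L, D = 1.0, 2.0
exact = 2.0 * L / (math.pi * D)
estimate = buffon_estimate(L, D, trials=200_000)
print(exact, estimate)
```

With L = 1 and D = 2 the exact value is 1/π, so the simulated frequency should settle near 0.318.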
CONDITIONAL DISTRIBUTIONS 
3.25. Verify Eqs. (3.36) and (3.47). 
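(a) By Eqs. (3.33) and (3.20),

$$\sum_{y_j} p_{Y|X}(y_j | x_i) = \frac{\sum_{y_j} p_{XY}(x_i, y_j)}{p_X(x_i)} = \frac{p_X(x_i)}{p_X(x_i)} = 1$$

(b) Similarly, by Eqs. (3.38) and (3.30),

$$\int_{-\infty}^{\infty} f_{Y|X}(y | x)\,dy = \frac{\int_{-\infty}^{\infty} f_{XY}(x, y)\,dy}{f_X(x)} = \frac{f_X(x)}{f_X(x)} = 1$$
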
3.26. Consider the bivariate r.v. (X, Y) of Prob. 3.14. 


(a) Find the conditional pmf's $p_{Y|X}(y_j | x_i)$ and $p_{X|Y}(x_i | y_j)$.
(b) Find P(Y = 2 | X = 2) and P(X = 2 | Y = 2).


(a) From the results of Prob. 3.14, we have

$$p_{XY}(x_i, y_j) = \begin{cases} \frac{1}{18}(2x_i + y_j) & x_i = 1, 2;\ y_j = 1, 2 \\ 0 & \text{otherwise} \end{cases}$$

$$p_X(x_i) = \tfrac{1}{18}(4x_i + 3) \qquad x_i = 1, 2$$

$$p_Y(y_j) = \tfrac{1}{18}(2y_j + 6) \qquad y_j = 1, 2$$

Thus, by Eqs. (3.33) and (3.34),

$$p_{Y|X}(y_j | x_i) = \frac{2x_i + y_j}{4x_i + 3} \qquad y_j = 1, 2;\ x_i = 1, 2$$

$$p_{X|Y}(x_i | y_j) = \frac{2x_i + y_j}{2y_j + 6} \qquad x_i = 1, 2;\ y_j = 1, 2$$

(b) Using the results of part (a), we obtain

$$P(Y = 2 | X = 2) = p_{Y|X}(2 | 2) = \frac{2(2) + 2}{4(2) + 3} = \frac{6}{11}$$

$$P(X = 2 | Y = 2) = p_{X|Y}(2 | 2) = \frac{2(2) + 2}{2(2) + 6} = \frac{3}{5}$$

3.27. Find the conditional pmf's $p_{Y|X}(y_j | x_i)$ and $p_{X|Y}(x_i | y_j)$ for the bivariate r.v. (X, Y) of Prob. 3.15.


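From the results of Prob. 3.15, we have

$$p_{XY}(x_i, y_j) = \begin{cases} \frac{1}{30} x_i^2 y_j & x_i = 1, 2;\ y_j = 1, 2, 3 \\ 0 & \text{otherwise} \end{cases}$$

with marginal pmf's $p_X(x_i) = \frac{1}{5} x_i^2$ and $p_Y(y_j) = \frac{1}{6} y_j$. Thus, by Eqs. (3.33) and (3.34),

$$p_{Y|X}(y_j | x_i) = \frac{\frac{1}{30} x_i^2 y_j}{\frac{1}{5} x_i^2} = \tfrac{1}{6} y_j \qquad y_j = 1, 2, 3;\ x_i = 1, 2$$

$$p_{X|Y}(x_i | y_j) = \frac{\frac{1}{30} x_i^2 y_j}{\frac{1}{6} y_j} = \tfrac{1}{5} x_i^2 \qquad x_i = 1, 2;\ y_j = 1, 2, 3$$

Note that $p_{Y|X}(y_j | x_i) = p_Y(y_j)$ and $p_{X|Y}(x_i | y_j) = p_X(x_i)$, as must be the case since X and Y are independent,
as shown in Prob. 3.15.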


3.28. Consider the bivariate r.v. (X, Y) of Prob. 3.17. 
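(a) Find the conditional pdf's $f_{Y|X}(y | x)$ and $f_{X|Y}(x | y)$.
(b) Find P(0 < Y < 1/2 | X = 1).

(a) From the results of Prob. 3.17, we have

$$f_{XY}(x, y) = \begin{cases} \frac{1}{8}(x + y) & 0 < x < 2,\ 0 < y < 2 \\ 0 & \text{otherwise} \end{cases}$$

$$f_X(x) = \tfrac{1}{4}(x + 1) \qquad 0 < x < 2$$

$$f_Y(y) = \tfrac{1}{4}(y + 1) \qquad 0 < y < 2$$

Thus, by Eqs. (3.38) and (3.39),

$$f_{Y|X}(y | x) = \frac{\frac{1}{8}(x + y)}{\frac{1}{4}(x + 1)} = \frac{1}{2}\,\frac{x + y}{x + 1} \qquad 0 < x < 2,\ 0 < y < 2$$

$$f_{X|Y}(x | y) = \frac{\frac{1}{8}(x + y)}{\frac{1}{4}(y + 1)} = \frac{1}{2}\,\frac{x + y}{y + 1} \qquad 0 < x < 2,\ 0 < y < 2$$

(b) Using the results of part (a), we obtain

$$P\left(0 < Y < \tfrac{1}{2}\,\middle|\,X = 1\right) = \int_0^{1/2} f_{Y|X}(y | x = 1)\,dy = \frac{1}{2}\int_0^{1/2}\left(\frac{1 + y}{2}\right)dy = \frac{5}{32}$$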


3.29. Find the conditional pdf's $f_{Y|X}(y | x)$ and $f_{X|Y}(x | y)$ for the bivariate r.v. (X, Y) of Prob. 3.18.
From the results of Prob. 3.18, we have

$$f_{XY}(x, y) = \begin{cases} 4xy & 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$

$$f_X(x) = 2x \qquad 0 < x < 1$$

$$f_Y(y) = 2y \qquad 0 < y < 1$$

Thus, by Eqs. (3.38) and (3.39),

$$f_{Y|X}(y | x) = \frac{4xy}{2x} = 2y \qquad 0 < y < 1,\ 0 < x < 1$$

$$f_{X|Y}(x | y) = \frac{4xy}{2y} = 2x \qquad 0 < x < 1,\ 0 < y < 1$$

Again note that $f_{Y|X}(y | x) = f_Y(y)$ and $f_{X|Y}(x | y) = f_X(x)$, as must be the case since X and Y are independent,
as shown in Prob. 3.18.


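3.30. Find the conditional pdf's $f_{Y|X}(y | x)$ and $f_{X|Y}(x | y)$ for the bivariate r.v. (X, Y) of Prob. 3.20.
From the results of Prob. 3.20, we have

$$f_{XY}(x, y) = \begin{cases} 2 & 0 < y \le x < 1 \\ 0 & \text{otherwise} \end{cases}$$

$$f_X(x) = 2x \qquad 0 < x < 1$$

$$f_Y(y) = 2(1 - y) \qquad 0 < y < 1$$

Thus, by Eqs. (3.38) and (3.39),

$$f_{Y|X}(y | x) = \frac{1}{x} \qquad y \le x < 1,\ 0 < x < 1$$

$$f_{X|Y}(x | y) = \frac{1}{1 - y} \qquad y \le x < 1,\ 0 < x < 1$$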


3.31. The joint pdf of a bivariate r.v. (X, Y) is given by

$$f_{XY}(x, y) = \begin{cases} \dfrac{1}{y}\,e^{-x/y} e^{-y} & x > 0,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$$

(a) Show that $f_{XY}(x, y)$ satisfies Eq. (3.26).
(b) Find P(X > 1 | Y = y).


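(a) We have

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = \int_0^{\infty}\int_0^{\infty}\frac{1}{y}\,e^{-x/y} e^{-y}\,dx\,dy = \int_0^{\infty}\frac{1}{y}\,e^{-y}\left[-y e^{-x/y}\right]_{x=0}^{x=\infty} dy = \int_0^{\infty} e^{-y}\,dy = 1$$

(b) First we must find the marginal pdf of Y. By Eq. (3.31),

$$f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx = \frac{1}{y}\,e^{-y}\int_0^{\infty} e^{-x/y}\,dx = \frac{1}{y}\,e^{-y}\left[-y e^{-x/y}\right]_{x=0}^{x=\infty} = e^{-y} \qquad y > 0$$

By Eq. (3.39), the conditional pdf of X is

$$f_{X|Y}(x | y) = \frac{f_{XY}(x, y)}{f_Y(y)} = \begin{cases} \dfrac{1}{y}\,e^{-x/y} & x > 0,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$$

Then

$$P(X > 1 | Y = y) = \int_1^{\infty} f_{X|Y}(x | y)\,dx = \int_1^{\infty}\frac{1}{y}\,e^{-x/y}\,dx = -e^{-x/y}\Big|_{x=1}^{x=\infty} = e^{-1/y}$$

COVARIANCE AND CORRELATION COEFFICIENTS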
3.32. Let (X, Y) be a bivariate r.v. If X and Y are independent, show that X and Y are uncorrelated. 
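If (X, Y) is a discrete bivariate r.v., then by Eqs. (3.43) and (3.22),

$$E(XY) = \sum_{y_j}\sum_{x_i} x_i y_j\,p_{XY}(x_i, y_j) = \sum_{y_j}\sum_{x_i} x_i y_j\,p_X(x_i)\,p_Y(y_j) = \left[\sum_{x_i} x_i\,p_X(x_i)\right]\left[\sum_{y_j} y_j\,p_Y(y_j)\right] = E(X)E(Y)$$

If (X, Y) is a continuous bivariate r.v., then by Eqs. (3.43) and (3.32),

$$E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,f_{XY}(x, y)\,dx\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,f_X(x) f_Y(y)\,dx\,dy = \int_{-\infty}^{\infty} x f_X(x)\,dx\int_{-\infty}^{\infty} y f_Y(y)\,dy = E(X)E(Y)$$

Thus, X and Y are uncorrelated by Eq. (3.52).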


3.33. Suppose the joint pmf of a bivariate r.v. (X, Y) is given by

$$p_{XY}(x_i, y_j) = \begin{cases} \frac{1}{3} & (x_i, y_j) = (0, 1),\ (1, 0),\ (2, 1) \\ 0 & \text{otherwise} \end{cases}$$


(a) Are X and Y independent? 
(b) Are X and Y uncorrelated? 
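(a) By Eq. (3.20), the marginal pmf's of X are

$$p_X(0) = \sum_{y_j} p_{XY}(0, y_j) = p_{XY}(0, 1) = \tfrac{1}{3}$$

$$p_X(1) = \sum_{y_j} p_{XY}(1, y_j) = p_{XY}(1, 0) = \tfrac{1}{3}$$

$$p_X(2) = \sum_{y_j} p_{XY}(2, y_j) = p_{XY}(2, 1) = \tfrac{1}{3}$$

By Eq. (3.21), the marginal pmf's of Y are

$$p_Y(0) = \sum_{x_i} p_{XY}(x_i, 0) = p_{XY}(1, 0) = \tfrac{1}{3}$$

$$p_Y(1) = \sum_{x_i} p_{XY}(x_i, 1) = p_{XY}(0, 1) + p_{XY}(2, 1) = \tfrac{2}{3}$$

and $p_{XY}(0, 1) = \tfrac{1}{3} \ne p_X(0)\,p_Y(1) = \tfrac{2}{9}$.
Thus X and Y are not independent.
(b) By Eqs. (3.45a), (3.45b), and (3.43), we have

$$E(X) = \sum_{x_i} x_i\,p_X(x_i) = (0)(\tfrac{1}{3}) + (1)(\tfrac{1}{3}) + (2)(\tfrac{1}{3}) = 1$$

$$E(Y) = \sum_{y_j} y_j\,p_Y(y_j) = (0)(\tfrac{1}{3}) + (1)(\tfrac{2}{3}) = \tfrac{2}{3}$$

$$E(XY) = \sum_{y_j}\sum_{x_i} x_i y_j\,p_{XY}(x_i, y_j) = (0)(1)(\tfrac{1}{3}) + (1)(0)(\tfrac{1}{3}) + (2)(1)(\tfrac{1}{3}) = \tfrac{2}{3}$$

Now by Eq. (3.51),

$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \tfrac{2}{3} - (1)(\tfrac{2}{3}) = 0$$

Thus, X and Y are uncorrelated.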


3.34. Let (X, Y) be a bivariate r.v. with the joint pdf

$$f_{XY}(x, y) = \frac{x^2 + y^2}{4\pi}\,e^{-(x^2+y^2)/2} \qquad -\infty < x < \infty,\ -\infty < y < \infty$$

Show that X and Y are not independent but are uncorrelated.
By Eq. (3.30), the marginal pdf of X is

$$f_X(x) = \frac{1}{4\pi}\int_{-\infty}^{\infty}(x^2 + y^2)\,e^{-(x^2+y^2)/2}\,dy = \frac{e^{-x^2/2}}{2\sqrt{2\pi}}\left[x^2\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}\,dy + \int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\,y^2 e^{-y^2/2}\,dy\right]$$

Noting that the integrand of the first integral in the above expression is the pdf of N(0; 1) and the second
integral in the above expression is the variance of N(0; 1), we have

$$f_X(x) = \frac{1}{2\sqrt{2\pi}}\,(x^2 + 1)\,e^{-x^2/2} \qquad -\infty < x < \infty$$

Since $f_{XY}(x, y)$ is symmetric in x and y, we have

$$f_Y(y) = \frac{1}{2\sqrt{2\pi}}\,(y^2 + 1)\,e^{-y^2/2} \qquad -\infty < y < \infty$$

Now $f_{XY}(x, y) \ne f_X(x) f_Y(y)$, and hence X and Y are not independent. Next, by Eqs. (3.47a) and (3.47b),

$$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx = 0$$

$$E(Y) = \int_{-\infty}^{\infty} y f_Y(y)\,dy = 0$$

since for each integral the integrand is an odd function. By Eq. (3.43),

$$E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,f_{XY}(x, y)\,dx\,dy = 0$$

The integral vanishes because the contributions of the second and the fourth quadrants cancel those of the
first and the third. Thus, E(XY) = E(X)E(Y), and so X and Y are uncorrelated.
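
The distinction between "uncorrelated" and "independent" is easy to confirm mechanically on the discrete example of Prob. 3.33. The sketch below is illustrative only; the helper names (`marginal`, `expectation`, `covariance`, `is_independent`) are hypothetical and introduced for this example. Exact rational arithmetic is used so that the covariance comes out as exactly zero.

```python
from fractions import Fraction

# Joint pmf of Prob. 3.33: mass 1/3 at each of (0, 1), (1, 0), (2, 1).
pmf = {(0, 1): Fraction(1, 3), (1, 0): Fraction(1, 3), (2, 1): Fraction(1, 3)}

def marginal(pmf, axis):
    """Marginal pmf along axis 0 (X) or axis 1 (Y)."""
    out = {}
    for point, p in pmf.items():
        out[point[axis]] = out.get(point[axis], Fraction(0)) + p
    return out

def expectation(pmf, g):
    """E[g(X, Y)] as a sum over the support of the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

def covariance(pmf):
    ex = expectation(pmf, lambda x, y: x)
    ey = expectation(pmf, lambda x, y: y)
    return expectation(pmf, lambda x, y: x * y) - ex * ey

def is_independent(pmf):
    """True only if p_XY(x, y) = p_X(x) p_Y(y) at every pair of marginal values."""
    px, py = marginal(pmf, 0), marginal(pmf, 1)
    return all(pmf.get((x, y), Fraction(0)) == px[x] * py[y] for x in px for y in py)

print(covariance(pmf), is_independent(pmf))
```

The covariance is zero while the product test fails, which is the same conclusion reached analytically in Probs. 3.33 and 3.34.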


3.35. Let (X, Y) be a bivariate r.v. Show that

$$[E(XY)]^2 \le E(X^2)E(Y^2) \tag{3.97}$$

This is known as the Cauchy-Schwarz inequality.

Consider the expression $E[(X - \alpha Y)^2]$ for any two r.v.'s X and Y and a real variable α. This expression,
when viewed as a quadratic in α, is greater than or equal to zero; that is,

$$E[(X - \alpha Y)^2] \ge 0$$

for any value of α. Expanding this, we obtain

$$E(X^2) - 2\alpha E(XY) + \alpha^2 E(Y^2) \ge 0$$

Choose a value of α for which the left-hand side of this inequality is minimum,

$$\alpha = \frac{E(XY)}{E(Y^2)}$$

which results in the inequality

$$E(X^2) - \frac{[E(XY)]^2}{E(Y^2)} \ge 0 \qquad \text{or} \qquad [E(XY)]^2 \le E(X^2)E(Y^2)$$


3.36. Verify Eq. (3.54). 
From the Cauchy-Schwarz inequality [Eq. (3.97)], we have 
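$$\{E[(X - \mu_X)(Y - \mu_Y)]\}^2 \le E[(X - \mu_X)^2]\,E[(Y - \mu_Y)^2]$$

or

$$\sigma_{XY}^2 \le \sigma_X^2\,\sigma_Y^2$$

Then

$$\rho_{XY}^2 = \frac{\sigma_{XY}^2}{\sigma_X^2\,\sigma_Y^2} \le 1$$

Since $\rho_{XY}$ is a real number, this implies

$$|\rho_{XY}| \le 1 \qquad \text{or} \qquad -1 \le \rho_{XY} \le 1$$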


3.37. Let (X, Y) be the bivariate r.v. of Prob. 3.12. 
(a) Find the mean and the variance of X. 
(b) Find the mean and the variance of Y. 
(c) Find the covariance of X and Y. 
(d) Find the correlation coefficient of X and Y. 
(a) From the results of Prob. 3.12, the mean and the variance of X are evaluated as follows:

$$E(X) = \sum_{x_i} x_i\,p_X(x_i) = (0)(0.5) + (1)(0.5) = 0.5$$

$$E(X^2) = \sum_{x_i} x_i^2\,p_X(x_i) = (0)^2(0.5) + (1)^2(0.5) = 0.5$$

$$\sigma_X^2 = E(X^2) - [E(X)]^2 = 0.5 - (0.5)^2 = 0.25$$

(b) Similarly, the mean and the variance of Y are

$$E(Y) = \sum_{y_j} y_j\,p_Y(y_j) = (0)(0.55) + (1)(0.45) = 0.45$$

$$E(Y^2) = \sum_{y_j} y_j^2\,p_Y(y_j) = (0)^2(0.55) + (1)^2(0.45) = 0.45$$

$$\sigma_Y^2 = E(Y^2) - [E(Y)]^2 = 0.45 - (0.45)^2 = 0.2475$$

(c) By Eq. (3.43),

$$E(XY) = \sum_{y_j}\sum_{x_i} x_i y_j\,p_{XY}(x_i, y_j) = (0)(0)(0.45) + (0)(1)(0.05) + (1)(0)(0.1) + (1)(1)(0.4) = 0.4$$

By Eq. (3.51), the covariance of X and Y is

$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = 0.4 - (0.5)(0.45) = 0.175$$

(d) By Eq. (3.53), the correlation coefficient of X and Y is

$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X\sigma_Y} = \frac{0.175}{\sqrt{(0.25)(0.2475)}} = 0.704$$
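
The four steps of this solution (means, variances, covariance, correlation coefficient) follow one pattern that can be coded directly from a joint pmf table. The sketch below assumes the joint probabilities used in part (c) of the solution; the function names `moment` and `summary` are hypothetical.

```python
import math

# Joint pmf of (X, Y) as used in Prob. 3.37 (values taken from part (c) above).
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.10, (1, 1): 0.40}

def moment(joint, g):
    """E[g(X, Y)] for a discrete joint pmf given as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def summary(joint):
    ex = moment(joint, lambda x, y: x)
    ey = moment(joint, lambda x, y: y)
    var_x = moment(joint, lambda x, y: x * x) - ex ** 2
    var_y = moment(joint, lambda x, y: y * y) - ey ** 2
    cov = moment(joint, lambda x, y: x * y) - ex * ey      # Eq. (3.51)
    rho = cov / math.sqrt(var_x * var_y)                   # Eq. (3.53)
    return ex, ey, var_x, var_y, cov, rho

ex, ey, var_x, var_y, cov, rho = summary(joint)
print(ex, ey, var_x, var_y, cov, round(rho, 3))
```

The same `summary` function applies unchanged to any finite joint pmf table, for instance the one in Prob. 3.14.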


3.38. Suppose that a bivariate r.v. (X, Y) is uniformly distributed over a unit circle (Prob. 3.21).
(a) Are X and Y independent?
(b) Are X and Y correlated?

(a) Setting R = 1 in the results of Prob. 3.21, we obtain

$$f_{XY}(x, y) = \begin{cases} \dfrac{1}{\pi} & x^2 + y^2 < 1 \\ 0 & x^2 + y^2 > 1 \end{cases}$$

$$f_X(x) = \frac{2}{\pi}\sqrt{1 - x^2} \qquad |x| < 1$$

$$f_Y(y) = \frac{2}{\pi}\sqrt{1 - y^2} \qquad |y| < 1$$

Since $f_{XY}(x, y) \ne f_X(x) f_Y(y)$, X and Y are not independent.
(b) By Eqs. (3.47a) and (3.47b), the means of X and Y are

$$E(X) = \frac{2}{\pi}\int_{-1}^{1} x\sqrt{1 - x^2}\,dx = 0$$

$$E(Y) = \frac{2}{\pi}\int_{-1}^{1} y\sqrt{1 - y^2}\,dy = 0$$

since each integrand is an odd function.
Next, by Eq. (3.43),

$$E(XY) = \frac{1}{\pi}\iint_{x^2+y^2<1} xy\,dx\,dy = 0$$

The integral vanishes because the contributions of the second and the fourth quadrants cancel those of
the first and the third. Hence, E(XY) = E(X)E(Y) = 0 and X and Y are uncorrelated.


CONDITIONAL MEANS AND CONDITIONAL VARIANCES 


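3.39. Consider the bivariate r.v. (X, Y) of Prob. 3.14 (or Prob. 3.26). Compute the conditional mean
and the conditional variance of Y given $x_i = 2$.

From Prob. 3.26, the conditional pmf $p_{Y|X}(y_j | x_i)$ is

$$p_{Y|X}(y_j | x_i) = \frac{2x_i + y_j}{4x_i + 3} \qquad y_j = 1, 2;\ x_i = 1, 2$$

Thus,

$$p_{Y|X}(y_j | 2) = \frac{4 + y_j}{11} \qquad y_j = 1, 2$$

and by Eqs. (3.55) and (3.56), the conditional mean and the conditional variance of Y given $x_i = 2$ are

$$\mu_{Y|2} = E(Y | x_i = 2) = \sum_{y_j} y_j\,p_{Y|X}(y_j | 2) = \sum_{y_j} y_j\left(\frac{4 + y_j}{11}\right) = (1)\left(\frac{5}{11}\right) + (2)\left(\frac{6}{11}\right) = \frac{17}{11}$$

$$\sigma_{Y|2}^2 = E\left[\left(Y - \mu_{Y|2}\right)^2 \Big|\, x_i = 2\right] = \sum_{y_j}\left(y_j - \frac{17}{11}\right)^2\left(\frac{4 + y_j}{11}\right) = \left(-\frac{6}{11}\right)^2\left(\frac{5}{11}\right) + \left(\frac{5}{11}\right)^2\left(\frac{6}{11}\right) = \frac{330}{1331}$$

3.40. Let (X, Y) be the bivariate r.v. of Prob. 3.20 (or Prob. 3.30). Compute the conditional means
E(Y | x) and E(X | y).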


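From Prob. 3.30,

$$f_{Y|X}(y | x) = \frac{1}{x} \qquad y \le x < 1,\ 0 < x < 1$$

$$f_{X|Y}(x | y) = \frac{1}{1 - y} \qquad y \le x < 1,\ 0 < x < 1$$

By Eq. (3.58), the conditional mean of Y, given X = x, is

$$E(Y | x) = \int_{-\infty}^{\infty} y f_{Y|X}(y | x)\,dy = \int_0^{x} y\left(\frac{1}{x}\right)dy = \frac{y^2}{2x}\Big|_{y=0}^{y=x} = \frac{x}{2} \qquad 0 < x < 1$$

Similarly, the conditional mean of X, given Y = y, is

$$E(X | y) = \int_{-\infty}^{\infty} x f_{X|Y}(x | y)\,dx = \int_y^{1} x\left(\frac{1}{1 - y}\right)dx = \frac{x^2}{2(1 - y)}\Big|_{x=y}^{x=1} = \frac{1 + y}{2} \qquad 0 < y < 1$$

Note that E(Y | x) is a function of x only and E(X | y) is a function of y only.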


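3.41. Let (X, Y) be the bivariate r.v. of Prob. 3.20 (or Prob. 3.30). Compute the conditional variances
Var(Y | x) and Var(X | y).

Using the results of Prob. 3.40 and Eq. (3.59), the conditional variance of Y, given X = x, is

$$\mathrm{Var}(Y | x) = E\{[Y - E(Y | x)]^2 | x\} = \int_{-\infty}^{\infty}\left(y - \frac{x}{2}\right)^2 f_{Y|X}(y | x)\,dy = \int_0^{x}\left(y - \frac{x}{2}\right)^2\frac{1}{x}\,dy = \frac{1}{3x}\left(y - \frac{x}{2}\right)^3\Big|_{y=0}^{y=x} = \frac{x^2}{12}$$

Similarly, the conditional variance of X, given Y = y, is

$$\mathrm{Var}(X | y) = E\{[X - E(X | y)]^2 | y\} = \int_{-\infty}^{\infty}\left(x - \frac{1 + y}{2}\right)^2 f_{X|Y}(x | y)\,dx = \int_y^{1}\left(x - \frac{1 + y}{2}\right)^2\frac{1}{1 - y}\,dx$$

$$= \frac{1}{3(1 - y)}\left(x - \frac{1 + y}{2}\right)^3\Big|_{x=y}^{x=1} = \frac{(1 - y)^2}{12}$$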
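
The conditional means of Prob. 3.40 and the conditional variances of Prob. 3.41 can be checked by numerical integration of the conditional pdf's. The sketch below is an illustration under the stated conditional densities (1/x on (0, x) and 1/(1 - y) on (y, 1)); the midpoint-rule helper `integrate` and the two function names are assumptions made for this example.

```python
def integrate(f, lo, hi, steps=20_000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def cond_moments_y_given_x(x):
    """Mean and variance of Y given X = x, using f_{Y|X}(y|x) = 1/x on (0, x)."""
    pdf = lambda y: 1.0 / x
    mean = integrate(lambda y: y * pdf(y), 0.0, x)
    var = integrate(lambda y: (y - mean) ** 2 * pdf(y), 0.0, x)
    return mean, var

def cond_moments_x_given_y(y):
    """Mean and variance of X given Y = y, using f_{X|Y}(x|y) = 1/(1-y) on (y, 1)."""
    pdf = lambda x: 1.0 / (1.0 - y)
    mean = integrate(lambda x: x * pdf(x), y, 1.0)
    var = integrate(lambda x: (x - mean) ** 2 * pdf(x), y, 1.0)
    return mean, var

print(cond_moments_y_given_x(0.6))   # compare with x/2 and x^2/12
print(cond_moments_x_given_y(0.2))   # compare with (1+y)/2 and (1-y)^2/12
```

Both conditional distributions are uniform, so the results match the familiar mean (midpoint) and variance (length squared over 12) of a uniform r.v.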


N-DIMENSIONAL RANDOM VECTORS 


3.42. Let $(X_1, X_2, X_3, X_4)$ be a four-dimensional random vector, where $X_k$ (k = 1, 2, 3, 4) are
independent Poisson r.v.'s with parameter 2.

(a) Find $P(X_1 = 1, X_2 = 3, X_3 = 2, X_4 = 1)$.
(b) Find the probability that exactly one of the $X_k$'s equals zero.
(a) By Eq. (2.40), the pmf of $X_k$ is

$$p_{X_k}(i) = P(X_k = i) = e^{-2}\,\frac{2^i}{i!} \qquad i = 0, 1, \ldots \tag{3.98}$$

Since the $X_k$'s are independent, by Eq. (3.80),

$$P(X_1 = 1, X_2 = 3, X_3 = 2, X_4 = 1) = p_{X_1}(1)\,p_{X_2}(3)\,p_{X_3}(2)\,p_{X_4}(1)$$

$$= \left(\frac{e^{-2}\,2}{1!}\right)\left(\frac{e^{-2}\,2^3}{3!}\right)\left(\frac{e^{-2}\,2^2}{2!}\right)\left(\frac{e^{-2}\,2}{1!}\right) = \frac{e^{-8}\,2^7}{12} \approx 3.58(10^{-3})$$

(b) First, we find the probability that $X_k = 0$, k = 1, 2, 3, 4. From Eq. (3.98),

$$P(X_k = 0) = e^{-2} \qquad k = 1, 2, 3, 4$$

Next, we treat zero as "success." If Y denotes the number of successes, then Y is a binomial r.v. with
parameters $(n, p) = (4, e^{-2})$. Thus, the probability that exactly one of the $X_k$'s equals zero is given by
[Eq. (2.36)]

$$P(Y = 1) = \binom{4}{1} e^{-2}\left(1 - e^{-2}\right)^3 \approx 0.35$$
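
Both parts reduce to evaluating standard pmf's, which the following short sketch does with the Python standard library. The helper names `poisson_pmf` and `binomial_pmf` are introduced here for illustration and simply implement the Poisson pmf of Eq. (3.98) and the binomial pmf referred to in the solution.

```python
import math

def poisson_pmf(i: int, lam: float) -> float:
    """P(X = i) for a Poisson r.v. with parameter lam."""
    return math.exp(-lam) * lam ** i / math.factorial(i)

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(Y = k) for a binomial r.v. with parameters (n, p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

lam = 2.0
# Part (a): independence lets the joint probability factor into a product.
part_a = math.prod(poisson_pmf(i, lam) for i in (1, 3, 2, 1))
# Part (b): "success" means X_k = 0, which has probability e^{-2}.
part_b = binomial_pmf(1, 4, poisson_pmf(0, lam))
print(part_a, part_b)
```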


3.43. Let (X, Y, Z) be a trivariate r.v., where X, Y, and Z are independent uniform r.v.’s over (0, 1). 
Compute P(Z > XY).

Since X, Y, Z are independent and uniformly distributed over (0, 1), we have

$$f_{XYZ}(x, y, z) = f_X(x) f_Y(y) f_Z(z) = 1 \qquad 0 < x < 1,\ 0 < y < 1,\ 0 < z < 1$$

Then

$$P(Z > XY) = \iiint_{z > xy} f_{XYZ}(x, y, z)\,dx\,dy\,dz = \int_0^1\int_0^1\int_{xy}^1 dz\,dy\,dx = \int_0^1\int_0^1 (1 - xy)\,dy\,dx = \int_0^1\left(1 - \frac{x}{2}\right)dx = \frac{3}{4}$$


3.44. Let (X, Y, Z) be a trivariate r.v. with joint pdf 


$$f_{XYZ}(x, y, z) = \begin{cases} k e^{-(ax+by+cz)} & x > 0,\ y > 0,\ z > 0 \\ 0 & \text{otherwise} \end{cases}$$

where a, b, c > 0 and k are constants.

(a) Determine the value of k.
(b) Find the marginal joint pdf of X and Y.
(c) Find the marginal pdf of X.
(d) Are X, Y, and Z independent?

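(a) By Eq. (3.76),

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XYZ}(x, y, z)\,dx\,dy\,dz = k\int_0^{\infty}\int_0^{\infty}\int_0^{\infty} e^{-(ax+by+cz)}\,dx\,dy\,dz$$

$$= k\int_0^{\infty} e^{-ax}\,dx\int_0^{\infty} e^{-by}\,dy\int_0^{\infty} e^{-cz}\,dz = \frac{k}{abc} = 1$$

Thus k = abc.

(b) By Eq. (3.77), the marginal joint pdf of X and Y is

$$f_{XY}(x, y) = \int_{-\infty}^{\infty} f_{XYZ}(x, y, z)\,dz = abc\int_0^{\infty} e^{-(ax+by+cz)}\,dz = abc\,e^{-(ax+by)}\int_0^{\infty} e^{-cz}\,dz = ab\,e^{-(ax+by)} \qquad x > 0,\ y > 0$$

(c) By Eq. (3.78), the marginal pdf of X is

$$f_X(x) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XYZ}(x, y, z)\,dy\,dz = abc\int_0^{\infty}\int_0^{\infty} e^{-(ax+by+cz)}\,dy\,dz = abc\,e^{-ax}\int_0^{\infty} e^{-by}\,dy\int_0^{\infty} e^{-cz}\,dz = a e^{-ax} \qquad x > 0$$

(d) Similarly, we obtain

$$f_Y(y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XYZ}(x, y, z)\,dx\,dz = b e^{-by} \qquad y > 0$$

$$f_Z(z) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XYZ}(x, y, z)\,dx\,dy = c e^{-cz} \qquad z > 0$$

Since $f_{XYZ}(x, y, z) = f_X(x) f_Y(y) f_Z(z)$, X, Y, and Z are independent.

3.45. Show that

$$f_{XYZ}(x, y, z) = f_{Z|X,Y}(z | x, y)\,f_{Y|X}(y | x)\,f_X(x) \tag{3.99}$$

By definition (3.79),

$$f_{Z|X,Y}(z | x, y) = \frac{f_{XYZ}(x, y, z)}{f_{XY}(x, y)}$$

Hence

$$f_{XYZ}(x, y, z) = f_{Z|X,Y}(z | x, y)\,f_{XY}(x, y) \tag{3.100}$$

Now, by Eq. (3.38),

$$f_{XY}(x, y) = f_{Y|X}(y | x)\,f_X(x)$$

Substituting this expression into Eq. (3.100), we obtain

$$f_{XYZ}(x, y, z) = f_{Z|X,Y}(z | x, y)\,f_{Y|X}(y | x)\,f_X(x)$$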


SPECIAL DISTRIBUTIONS 


3.46. Derive Eq. (3.87).

Consider a sequence of n independent multinomial trials. Let $A_i$ (i = 1, 2, ..., k) be the outcome of a
single trial. The r.v. $X_i$ is equal to the number of times $A_i$ occurs in the n trials. If $x_1, x_2, \ldots, x_k$ are
nonnegative integers such that their sum equals n, then for such a sequence the probability that $A_i$ occurs $x_i$
times, i = 1, 2, ..., k, that is, $P(X_1 = x_1, X_2 = x_2, \ldots, X_k = x_k)$, can be obtained by counting the number
of sequences containing exactly $x_1$ $A_1$'s, $x_2$ $A_2$'s, ..., $x_k$ $A_k$'s and multiplying by $p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$. The total
number of such sequences is given by the number of ways we could lay out in a row n things, of which $x_1$
are of one kind, $x_2$ are of a second kind, ..., $x_k$ are of a kth kind. The number of ways we could choose
positions for the $A_1$'s is $\binom{n}{x_1}$; after having put the $A_1$'s in their position, the number of ways we could
choose positions for the $A_2$'s is $\binom{n - x_1}{x_2}$, and so on. Thus, the total number of sequences with $x_1$ $A_1$'s, $x_2$
$A_2$'s, ..., $x_k$ $A_k$'s is given by

$$\binom{n}{x_1}\binom{n - x_1}{x_2}\binom{n - x_1 - x_2}{x_3}\cdots\binom{n - x_1 - x_2 - \cdots - x_{k-1}}{x_k}$$

$$= \frac{n!}{x_1!\,(n - x_1)!}\,\frac{(n - x_1)!}{x_2!\,(n - x_1 - x_2)!}\cdots\frac{(n - x_1 - x_2 - \cdots - x_{k-1})!}{x_k!\,0!} = \frac{n!}{x_1!\,x_2!\cdots x_k!}$$

Thus, we obtain

$$p_{X_1 X_2 \cdots X_k}(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1!\,x_2!\cdots x_k!}\,p_1^{x_1} p_2^{x_2}\cdots p_k^{x_k}$$

3.47. Suppose that a fair die is rolled seven times. Find the probability that 1 and 2 dots appear twice
each; 3, 4, and 5 dots once each; and 6 dots not at all.

Let $(X_1, X_2, \ldots, X_6)$ be a six-dimensional random vector, where $X_i$ denotes the number of times i dots
appear in seven rolls of a fair die. Then $(X_1, X_2, \ldots, X_6)$ is a multinomial r.v. with parameters $(7, p_1, p_2, \ldots, p_6)$
where $p_i = \frac{1}{6}$ (i = 1, 2, ..., 6). Hence, by Eq. (3.87),

$$P(X_1 = 2, X_2 = 2, X_3 = 1, X_4 = 1, X_5 = 1, X_6 = 0) = \frac{7!}{2!\,2!\,1!\,1!\,1!\,0!}\left(\frac{1}{6}\right)^2\left(\frac{1}{6}\right)^2\left(\frac{1}{6}\right)^1\left(\frac{1}{6}\right)^1\left(\frac{1}{6}\right)^1\left(\frac{1}{6}\right)^0$$

$$= \frac{7!}{2!\,2!}\left(\frac{1}{6}\right)^7 = \frac{35}{6^5} \approx 0.0045$$
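
The multinomial pmf of Eq. (3.87) is straightforward to evaluate exactly. The sketch below is an illustration, with the function name `multinomial_pmf` and the use of exact fractions chosen for this example; it reproduces the die-rolling probability of Prob. 3.47.

```python
import math
from fractions import Fraction

def multinomial_pmf(counts, probs):
    """Eq. (3.87): n! / (x_1! ... x_k!) * p_1^x_1 ... p_k^x_k, with n = sum(counts)."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    coeff = math.factorial(sum(counts))
    for x in counts:
        coeff //= math.factorial(x)      # exact: each partial quotient is an integer
    value = Fraction(coeff)
    for x, p in zip(counts, probs):
        value *= Fraction(p) ** x
    return value

fair_die = [Fraction(1, 6)] * 6
p = multinomial_pmf([2, 2, 1, 1, 1, 0], fair_die)
print(p, float(p))
```

Working with `Fraction` keeps the result in the closed form 35/6^5 before converting to a decimal.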


3.48. Show that the pmf of a multinomial r.v. given by Eq. (3.87) satisfies the condition (3.68); that is,

$$\sum\sum\cdots\sum p_{X_1 X_2 \cdots X_k}(x_1, x_2, \ldots, x_k) = 1 \tag{3.101}$$

where the summation is over the set of all nonnegative integers $x_1, x_2, \ldots, x_k$ whose sum is n.

The multinomial theorem (which is an extension of the binomial theorem) states that

$$(a_1 + a_2 + \cdots + a_k)^n = \sum\binom{n}{x_1\ x_2\ \cdots\ x_k} a_1^{x_1} a_2^{x_2}\cdots a_k^{x_k} \tag{3.102}$$

where $x_1 + x_2 + \cdots + x_k = n$ and

$$\binom{n}{x_1\ x_2\ \cdots\ x_k} = \frac{n!}{x_1!\,x_2!\cdots x_k!}$$

is called the multinomial coefficient, and the summation is over the set of all nonnegative integers $x_1, x_2, \ldots, x_k$
whose sum is n.
Thus, setting $a_i = p_i$ in Eq. (3.102), we obtain

$$\sum\sum\cdots\sum p_{X_1 X_2 \cdots X_k}(x_1, x_2, \ldots, x_k) = (p_1 + p_2 + \cdots + p_k)^n = (1)^n = 1$$



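3.49. Let (X, Y) be a bivariate normal r.v. with its pdf given by Eq. (3.88).

(a) Find the marginal pdf's of X and Y.
(b) Show that X and Y are independent when ρ = 0.
(a) By Eq. (3.30), the marginal pdf of X is

$$f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy$$

From Eqs. (3.88) and (3.89), we have

$$f_{XY}(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y(1 - \rho^2)^{1/2}}\exp\left[-\frac{1}{2}\,q(x, y)\right]$$

$$q(x, y) = \frac{1}{1 - \rho^2}\left[\left(\frac{x - \mu_X}{\sigma_X}\right)^2 - 2\rho\left(\frac{x - \mu_X}{\sigma_X}\right)\left(\frac{y - \mu_Y}{\sigma_Y}\right) + \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2\right]$$

Rewriting q(x, y),

$$q(x, y) = \frac{1}{1 - \rho^2}\left[\left(\frac{y - \mu_Y}{\sigma_Y}\right) - \rho\left(\frac{x - \mu_X}{\sigma_X}\right)\right]^2 + \left(\frac{x - \mu_X}{\sigma_X}\right)^2$$

$$= \frac{1}{(1 - \rho^2)\sigma_Y^2}\left[y - \mu_Y - \rho\,\frac{\sigma_Y}{\sigma_X}(x - \mu_X)\right]^2 + \left(\frac{x - \mu_X}{\sigma_X}\right)^2$$

Then

$$f_X(x) = \frac{\exp\left[-\dfrac{1}{2}\left(\dfrac{x - \mu_X}{\sigma_X}\right)^2\right]}{\sqrt{2\pi}\,\sigma_X}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}\,\sigma_Y(1 - \rho^2)^{1/2}}\exp\left[-\frac{1}{2}\,q_1(x, y)\right]dy$$

where

$$q_1(x, y) = \frac{1}{(1 - \rho^2)\sigma_Y^2}\left[y - \mu_Y - \rho\,\frac{\sigma_Y}{\sigma_X}(x - \mu_X)\right]^2$$

Comparing the integrand with Eq. (2.52), we see that the integrand is a normal pdf with mean
$\mu_Y + \rho(\sigma_Y/\sigma_X)(x - \mu_X)$ and variance $(1 - \rho^2)\sigma_Y^2$. Thus, the integral must be unity and we obtain

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_X}\exp\left[\frac{-(x - \mu_X)^2}{2\sigma_X^2}\right] \tag{3.103}$$

In a similar manner, the marginal pdf of Y is

$$f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma_Y}\exp\left[\frac{-(y - \mu_Y)^2}{2\sigma_Y^2}\right] \tag{3.104}$$

(b) When ρ = 0, Eq. (3.88) reduces to

$$f_{XY}(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y}\exp\left\{-\frac{1}{2}\left[\left(\frac{x - \mu_X}{\sigma_X}\right)^2 + \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2\right]\right\}$$

$$= \frac{1}{\sqrt{2\pi}\,\sigma_X}\exp\left[-\frac{1}{2}\left(\frac{x - \mu_X}{\sigma_X}\right)^2\right]\frac{1}{\sqrt{2\pi}\,\sigma_Y}\exp\left[-\frac{1}{2}\left(\frac{y - \mu_Y}{\sigma_Y}\right)^2\right] = f_X(x) f_Y(y)$$

Hence, X and Y are independent.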


3.50. Show that p in Eq. (3.88) is the correlation coefficient of X and Y. 
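By Eqs. (3.50) and (3.53), the correlation coefficient of X and Y is

$$\rho_{XY} = E\left[\left(\frac{X - \mu_X}{\sigma_X}\right)\left(\frac{Y - \mu_Y}{\sigma_Y}\right)\right] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left(\frac{x - \mu_X}{\sigma_X}\right)\left(\frac{y - \mu_Y}{\sigma_Y}\right) f_{XY}(x, y)\,dx\,dy \tag{3.105}$$

where $f_{XY}(x, y)$ is given by Eq. (3.88). By making a change in variables $v = (x - \mu_X)/\sigma_X$ and $w = (y - \mu_Y)/\sigma_Y$,
we can write Eq. (3.105) as

$$\rho_{XY} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} vw\,\frac{1}{2\pi(1 - \rho^2)^{1/2}}\exp\left[-\frac{1}{2(1 - \rho^2)}\left(v^2 - 2\rho vw + w^2\right)\right]dv\,dw$$

$$= \int_{-\infty}^{\infty}\frac{w}{\sqrt{2\pi}}\left\{\int_{-\infty}^{\infty}\frac{v}{\sqrt{2\pi}\,(1 - \rho^2)^{1/2}}\exp\left[-\frac{(v - \rho w)^2}{2(1 - \rho^2)}\right]dv\right\} e^{-w^2/2}\,dw$$

The term in the curly braces is identified as the mean of $V = N(\rho w;\ 1 - \rho^2)$, and so

$$\rho_{XY} = \int_{-\infty}^{\infty}\frac{w}{\sqrt{2\pi}}\,(\rho w)\,e^{-w^2/2}\,dw = \rho\int_{-\infty}^{\infty} w^2\,\frac{1}{\sqrt{2\pi}}\,e^{-w^2/2}\,dw$$

The last integral is the variance of W = N(0; 1), and so it is equal to 1 and we obtain $\rho_{XY} = \rho$.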
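
The structure used in this derivation (given W = w, the variable V is normal with mean ρw and variance 1 - ρ²) also gives a simple way to simulate standardized bivariate normal pairs. The sketch below is an assumption-laden illustration, not part of the original text: the function names, the seed, and the sample size are arbitrary choices. It checks empirically that the sample correlation is close to ρ.

```python
import math
import random

def sample_bivariate_normal(rho, n, seed=0):
    """Draw n pairs (v, w) of standardized normal variables with correlation rho.

    Uses the conditional structure from the derivation: given W = w,
    V is N(rho * w; 1 - rho^2).
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)
        v = rho * w + scale * rng.gauss(0.0, 1.0)
        pairs.append((v, w))
    return pairs

def sample_correlation(pairs):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(pairs)
    mv = sum(v for v, _ in pairs) / n
    mw = sum(w for _, w in pairs) / n
    cov = sum((v - mv) * (w - mw) for v, w in pairs) / n
    var_v = sum((v - mv) ** 2 for v, _ in pairs) / n
    var_w = sum((w - mw) ** 2 for _, w in pairs) / n
    return cov / math.sqrt(var_v * var_w)

rho = 0.6
estimate = sample_correlation(sample_bivariate_normal(rho, 100_000))
print(rho, estimate)
```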


3.51. Let (X, Y) be a bivariate normal r.v. with its pdf given by Eq. (3.88). Determine E(Y | x). 


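By Eq. (3.58),

$$E(Y | x) = \int_{-\infty}^{\infty} y f_{Y|X}(y | x)\,dy \tag{3.106}$$

where

$$f_{Y|X}(y | x) = \frac{f_{XY}(x, y)}{f_X(x)} \tag{3.107}$$

Substituting Eqs. (3.88) and (3.103) into Eq. (3.107), and after some cancellation and rearranging, we obtain

$$f_{Y|X}(y | x) = \frac{1}{\sqrt{2\pi}\,\sigma_Y(1 - \rho^2)^{1/2}}\exp\left\{-\frac{1}{2\sigma_Y^2(1 - \rho^2)}\left[y - \rho\,\frac{\sigma_Y}{\sigma_X}(x - \mu_X) - \mu_Y\right]^2\right\}$$

which is equal to the pdf of a normal r.v. with mean $\mu_Y + \rho(\sigma_Y/\sigma_X)(x - \mu_X)$ and variance $(1 - \rho^2)\sigma_Y^2$. Thus,
we get

$$E(Y | x) = \mu_Y + \rho\,\frac{\sigma_Y}{\sigma_X}(x - \mu_X) \tag{3.108}$$

Note that when X and Y are independent, then ρ = 0 and $E(Y | x) = \mu_Y = E(Y)$.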


3.52. The joint pdf of a bivariate r.v. (X, Y) is given by

$$f_{XY}(x, y) = \frac{1}{2\sqrt{3}\,\pi}\exp\left[-\frac{1}{3}\left(x^2 - xy + y^2 + x - 2y + 1\right)\right] \qquad -\infty < x, y < \infty$$

(a) Find the means of X and Y.
(b) Find the variances of X and Y.
(c) Find the correlation coefficient of X and Y.

We note that the term in the bracket of the exponential is a quadratic function of x and y, and hence
$f_{XY}(x, y)$ could be a pdf of a bivariate normal r.v. If so, then it is simpler to solve equations for the various
parameters. Now, the given joint pdf of (X, Y) can be expressed as

$$f_{XY}(x, y) = \frac{1}{2\sqrt{3}\,\pi}\exp\left[-\frac{1}{2}\,q(x, y)\right]$$

where

$$q(x, y) = \tfrac{2}{3}\left(x^2 - xy + y^2 + x - 2y + 1\right) = \tfrac{2}{3}\left[x^2 - x(y - 1) + (y - 1)^2\right]$$

Comparing the above expressions with Eqs. (3.88) and (3.89), we see that $f_{XY}(x, y)$ is the pdf of a bivariate
normal r.v. with $\mu_X = 0$, $\mu_Y = 1$, and the following equations:

$$2\pi\sigma_X\sigma_Y\sqrt{1 - \rho^2} = 2\sqrt{3}\,\pi$$

$$(1 - \rho^2)\sigma_X^2 = (1 - \rho^2)\sigma_Y^2 = \tfrac{3}{2}$$

Solving for $\sigma_X^2$, $\sigma_Y^2$, and ρ, we get

$$\sigma_X^2 = \sigma_Y^2 = 2 \qquad \text{and} \qquad \rho = \tfrac{1}{2}$$

Hence

(a) The mean of X is zero, and the mean of Y is 1.
(b) The variance of both X and Y is 2.
(c) The correlation coefficient of X and Y is 1/2.
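
As a cross-check by a different route than the one used in the solution, the quadratic form q(x, y) can be read as $(\mathbf{x} - \boldsymbol{\mu})^T K^{-1} (\mathbf{x} - \boldsymbol{\mu})$, so that the coefficients of q give the inverse of the covariance matrix K directly. The sketch below inverts that 2 × 2 matrix by hand. The term "precision matrix" for $K^{-1}$ and the function name are introduced here and do not appear in the original text.

```python
import math
from fractions import Fraction

def covariance_from_precision(p11, p12, p22):
    """Invert a symmetric 2x2 precision matrix [[p11, p12], [p12, p22]].

    Returns (var_x, cov_xy, var_y), the entries of the covariance matrix K.
    """
    det = p11 * p22 - p12 * p12
    if det <= 0:
        raise ValueError("precision matrix must be positive definite")
    return p22 / det, -p12 / det, p11 / det

# q(x, y) = (2/3)[x^2 - x(y - 1) + (y - 1)^2]
#         = p11 x^2 + 2 p12 x (y - 1) + p22 (y - 1)^2
p11, p12, p22 = Fraction(2, 3), Fraction(-1, 3), Fraction(2, 3)
var_x, cov_xy, var_y = covariance_from_precision(p11, p12, p22)
rho = float(cov_xy) / math.sqrt(float(var_x) * float(var_y))
print(var_x, var_y, cov_xy, rho)
```

This route also fixes the sign of ρ, which comes from the sign of the cross term in q(x, y).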


3.53. Consider a bivariate r.v. (X, Y), where X and Y denote the horizontal and vertical miss distances,
respectively, from a target when a bullet is fired. Assume that X and Y are independent
and that the probability of the bullet landing on any point of the xy plane depends only on the
distance of the point from the target. Show that (X, Y) is a bivariate normal r.v.

From the assumption, we have

$$f_{XY}(x, y) = f_X(x) f_Y(y) = g(x^2 + y^2) \tag{3.109}$$

for some function g. Differentiating Eq. (3.109) with respect to x, we have

$$f_X'(x) f_Y(y) = 2x\,g'(x^2 + y^2) \tag{3.110}$$

Dividing Eq. (3.110) by Eq. (3.109) and rearranging, we get

$$\frac{f_X'(x)}{2x f_X(x)} = \frac{g'(x^2 + y^2)}{g(x^2 + y^2)} \tag{3.111}$$

Note that the left-hand side of Eq. (3.111) depends only on x, whereas the right-hand side depends only on
$x^2 + y^2$; thus

$$\frac{f_X'(x)}{x f_X(x)} = c \tag{3.112}$$

where c is a constant. Rewriting Eq. (3.112) as

$$\frac{f_X'(x)}{f_X(x)} = cx \qquad \text{or} \qquad \frac{d}{dx}\left[\ln f_X(x)\right] = cx \tag{3.113}$$

and integrating both sides, we get

$$\ln f_X(x) = \frac{c}{2}\,x^2 + a \qquad \text{or} \qquad f_X(x) = k e^{cx^2/2}$$

where a and k are constants. By the properties of a pdf, the constant c must be negative, and setting
$c = -1/\sigma^2$, we have

$$f_X(x) = k e^{-x^2/(2\sigma^2)}$$

Thus, by Eq. (2.52), $X = N(0; \sigma^2)$ and

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/(2\sigma^2)}$$

In a similar way, we can obtain the pdf of Y as

$$f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-y^2/(2\sigma^2)}$$

Since X and Y are independent, the joint pdf of (X, Y) is

$$f_{XY}(x, y) = f_X(x) f_Y(y) = \frac{1}{2\pi\sigma^2}\,e^{-(x^2+y^2)/(2\sigma^2)}$$

which indicates that (X, Y) is a bivariate normal r.v.


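3.54. Let $(X_1, X_2, \ldots, X_n)$ be an n-variate normal r.v. with its joint pdf given by Eq. (3.92). Show that
if the covariance of $X_i$ and $X_j$ is zero for i ≠ j, that is,

$$\mathrm{Cov}(X_i, X_j) = \sigma_{ij} = \begin{cases} \sigma_i^2 & i = j \\ 0 & i \ne j \end{cases} \tag{3.114}$$

then $X_1, X_2, \ldots, X_n$ are independent.
From Eq. (3.94) with Eq. (3.114), the covariance matrix K becomes

$$K = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{bmatrix} \tag{3.115}$$

It therefore follows that

$$|\det K|^{1/2} = \sigma_1\sigma_2\cdots\sigma_n = \prod_{i=1}^{n}\sigma_i \tag{3.116}$$

and

$$K^{-1} = \begin{bmatrix} \dfrac{1}{\sigma_1^2} & 0 & \cdots & 0 \\ 0 & \dfrac{1}{\sigma_2^2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \dfrac{1}{\sigma_n^2} \end{bmatrix} \tag{3.117}$$

Then we can write

$$(\mathbf{x} - \boldsymbol{\mu})^T K^{-1}(\mathbf{x} - \boldsymbol{\mu}) = \sum_{i=1}^{n}\left(\frac{x_i - \mu_i}{\sigma_i}\right)^2 \tag{3.118}$$

Substituting Eqs. (3.116) and (3.118) into Eq. (3.92), we obtain

$$f_{X_1\cdots X_n}(x_1, \ldots, x_n) = \frac{1}{(2\pi)^{n/2}\left(\prod_{i=1}^{n}\sigma_i\right)}\exp\left[-\frac{1}{2}\sum_{i=1}^{n}\left(\frac{x_i - \mu_i}{\sigma_i}\right)^2\right] \tag{3.119}$$

Now Eq. (3.119) can be rewritten as

$$f_{X_1\cdots X_n}(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_{X_i}(x_i) \tag{3.120}$$

where

$$f_{X_i}(x_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\,e^{-(x_i - \mu_i)^2/(2\sigma_i^2)}$$

Thus we conclude that $X_1, X_2, \ldots, X_n$ are independent.

Supplementary Problems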


3.55. Consider an experiment of tossing a fair coin three times. Let (X, Y) be a bivariate r.v., where X denotes the
number of heads on the first two tosses and Y denotes the number of heads on the third toss.

(a) Find the range of X.
(b) Find the range of Y.
(c) Find the range of (X, Y).
(d) Find (i) P(X ≤ 2, Y ≤ 1); (ii) P(X ≤ 1, Y ≤ 1); and (iii) P(X ≤ 0, Y ≤ 0).

Ans. (a) $R_X$ = {0, 1, 2}
(b) $R_Y$ = {0, 1}
(c) $R_{XY}$ = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)}
(d) (i) P(X ≤ 2, Y ≤ 1) = 1; (ii) P(X ≤ 1, Y ≤ 1) = 3/4; and (iii) P(X ≤ 0, Y ≤ 0) = 1/8

3.56. Let $F_{XY}(x, y)$ be a joint cdf of a bivariate r.v. (X, Y). Show that

$$P(X > a, Y > c) = 1 - F_X(a) - F_Y(c) + F_{XY}(a, c)$$

where $F_X(x)$ and $F_Y(y)$ are marginal cdf's of X and Y, respectively.

Hint: Set $x_1 = a$, $y_1 = c$, and $x_2 = y_2 = \infty$ in Eq. (3.95) and use Eqs. (3.13) and (3.14).

3.57. Let the joint pmf of (X, Y) be given by

$$p_{XY}(x_i, y_j) = \begin{cases} k(x_i + y_j) & x_i = 1, 2, 3;\ y_j = 1, 2 \\ 0 & \text{otherwise} \end{cases}$$

where k is a constant.

(a) Find the value of k.
(b) Find the marginal pmf's of X and Y.

Ans. (a) k = 1/21
(b) $p_X(x_i) = \frac{1}{21}(2x_i + 3)$, $x_i$ = 1, 2, 3
$p_Y(y_j) = \frac{1}{21}(6 + 3y_j)$, $y_j$ = 1, 2

3.58. The joint pdf of (X, Y) is given by

$$f_{XY}(x, y) = \begin{cases} k e^{-(x+2y)} & x > 0,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$$

where k is a constant.

(a) Find the value of k.
(b) Find P(X > 1, Y < 1), P(X < Y), and P(X ≤ 2).

Ans. (a) k = 2
(b) $P(X > 1, Y < 1) = e^{-1} - e^{-3} \approx 0.318$; $P(X < Y) = \frac{1}{3}$; $P(X \le 2) = 1 - e^{-2} \approx 0.865$

3.59. Let (X, Y) be a bivariate r.v., where X is a uniform r.v. over (0, 0.2) and Y is an exponential r.v. with
parameter 5, and X and Y are independent.

(a) Find the joint pdf of (X, Y).
(b) Find P(Y ≤ X).

Ans. (a) $f_{XY}(x, y) = \begin{cases} 25 e^{-5y} & 0 < x < 0.2,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$
(b) $P(Y \le X) = e^{-1} \approx 0.368$

3.60. Let the joint pdf of (X, Y) be given by

$$f_{XY}(x, y) = \begin{cases} x e^{-x(y+1)} & x > 0,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$$

(a) Show that $f_{XY}(x, y)$ satisfies Eq. (3.26).
(b) Find the marginal pdf's of X and Y.

Ans. (b) $f_X(x) = e^{-x}$, x > 0
$f_Y(y) = \dfrac{1}{(y + 1)^2}$, y > 0

3.61. The joint pdf of (X, Y) is given by

$$f_{XY}(x, y) = \begin{cases} k x^2 (4 - y) & x < y < 2x,\ 0 < x < 2 \\ 0 & \text{otherwise} \end{cases}$$

where k is a constant.

(a) Find the value of k.
(b) Find the marginal pdf's of X and Y.

Ans. (a) k = 5/32
(b) $f_X(x) = \frac{5}{32}\,x^3\left(4 - \frac{3}{2}x\right)$, 0 < x < 2

$$f_Y(y) = \begin{cases} \left(\frac{5}{32}\right)\frac{7}{24}(4 - y)\,y^3 & 0 < y < 2 \\ \left(\frac{5}{32}\right)\frac{1}{3}(4 - y)\left(8 - \frac{1}{8}y^3\right) & 2 < y < 4 \\ 0 & \text{otherwise} \end{cases}$$

3.62. The joint pdf of (X, Y) is given by

$$f_{XY}(x, y) = \begin{cases} xy\,e^{-(x^2+y^2)/2} & x > 0,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the marginal pdf's of X and Y.
(b) Are X and Y independent?

Ans. (a) $f_X(x) = x e^{-x^2/2}$, x > 0
$f_Y(y) = y e^{-y^2/2}$, y > 0
(b) Yes

3.63. The joint pdf of (X, Y) is given by

$$f_{XY}(x, y) = \begin{cases} e^{-(x+y)} & x > 0,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$$

(a) Are X and Y independent?
(b) Find the conditional pdf's of X and Y.

Ans. (a) Yes
(b) $f_{X|Y}(x | y) = e^{-x}$, x > 0
$f_{Y|X}(y | x) = e^{-y}$, y > 0

3.64. The joint pdf of (X, Y) is given by

$$f_{XY}(x, y) = \begin{cases} e^{-y} & 0 < x < y \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the conditional pdf of Y, given that X = x.
(b) Find the conditional cdf of Y, given that X = x.

Ans. (a) $f_{Y|X}(y | x) = e^{x-y}$, y ≥ x
(b) $F_{Y|X}(y | x) = \begin{cases} 0 & y \le x \\ 1 - e^{x-y} & y \ge x \end{cases}$



3.65. Consider the bivariate r.v. (X, Y) of Prob. 3.14.
(a) Find the mean and the variance of X.
(b) Find the mean and the variance of Y.
(c) Find the covariance of X and Y.
(d) Find the correlation coefficient of X and Y.

Ans. (a) E(X) = 29/18, Var(X) = 77/324
(b) E(Y) = 14/9, Var(Y) = 20/81
(c) Cov(X, Y) = -1/162
(d) ρ = -0.025

3.66. Consider a bivariate r.v. (X, Y) with joint pdf

$$f_{XY}(x, y) = \frac{1}{2\pi\sigma^2}\,e^{-(x^2+y^2)/(2\sigma^2)} \qquad -\infty < x, y < \infty$$

Find $P[(X, Y)\,|\,x^2 + y^2 \le a^2]$.

Ans. $1 - e^{-a^2/(2\sigma^2)}$

3.67. Let (X, Y) be a bivariate normal r.v., where X and Y each have zero mean and variance $\sigma^2$, and the
correlation coefficient of X and Y is ρ. Find the joint pdf of (X, Y).

Ans. $f_{XY}(x, y) = \dfrac{1}{2\pi\sigma^2(1 - \rho^2)^{1/2}}\exp\left[-\dfrac{1}{2}\,\dfrac{x^2 - 2\rho xy + y^2}{\sigma^2(1 - \rho^2)}\right]$

3.68. The joint pdf of a bivariate r.v. (X, Y) is given by

$$f_{XY}(x, y) = \frac{1}{\sqrt{3}\,\pi}\exp\left[-\frac{2}{3}\left(x^2 - xy + y^2\right)\right]$$

(a) Find the means and variances of X and Y.
(b) Find the correlation coefficient of X and Y.

Ans. (a) $\mu_X = \mu_Y = 0$, $\sigma_X^2 = \sigma_Y^2 = 1$
(b) ρ = 1/2

3.69. Let (X, Y, Z) be a trivariate r.v., where X, Y, and Z are independent and each has a uniform distribution
over (0, 1). Compute P(X ≥ Y ≥ Z).

Ans. 1/6


Chapter 4 


Functions of Random Variables, Expectation, 
Limit Theorems 


4.1 INTRODUCTION


In this chapter we study a few basic concepts of functions of random variables and investigate the 
expected value of a certain function of a random variable. The techniques of moment generating 
functions and characteristic functions, which are very useful in some applications, are presented. 
Finally, the laws of large numbers and the central limit theorem, which is one of the most remarkable 
results in probability theory, are discussed. 


4.2 FUNCTIONS OF ONE RANDOM VARIABLE
A. Random Variable g(X):
Given a r.v. X and a function g(x), the expression

$$Y = g(X) \tag{4.1}$$

defines a new r.v. Y. With y a given number, we denote by $D_Y$ the subset of $R_X$ (range of X) such that
g(x) ≤ y. Then

$$(Y \le y) = [g(X) \le y] = (X \in D_Y) \tag{4.2}$$

where $(X \in D_Y)$ is the event consisting of all outcomes ζ such that the point $X(\zeta) \in D_Y$. Hence

$$F_Y(y) = P(Y \le y) = P[g(X) \le y] = P(X \in D_Y) \tag{4.3}$$

If X is a continuous r.v. with pdf $f_X(x)$, then

$$F_Y(y) = \int_{D_Y} f_X(x)\,dx \tag{4.4}$$

B. Determination of $f_Y(y)$ from $f_X(x)$:

Let X be a continuous r.v. with pdf $f_X(x)$. If the transformation y = g(x) is one-to-one and has the
inverse transformation

$$x = g^{-1}(y) = h(y) \tag{4.5}$$

then the pdf of Y is given by (Prob. 4.2)

$$f_Y(y) = f_X(x)\left|\frac{dx}{dy}\right| = f_X[h(y)]\left|\frac{dh(y)}{dy}\right| \tag{4.6}$$

Note that if g(x) is a continuous monotonic increasing or decreasing function, then the transformation
y = g(x) is one-to-one. If the transformation y = g(x) is not one-to-one, $f_Y(y)$ is obtained as
follows: Denoting the real roots of y = g(x) by $x_k$, that is,

$$y = g(x_1) = \cdots = g(x_k) = \cdots \tag{4.7}$$

then

$$f_Y(y) = \sum_k \frac{f_X(x_k)}{|g'(x_k)|} \tag{4.8}$$

where g'(x) is the derivative of g(x).
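
As an illustration of the root-sum formula for a transformation that is not one-to-one, the sketch below takes Y = X² with X a standard normal r.v. This choice of g and of the distribution of X is an assumption made only for the example. The two real roots are x = ±√y and g'(x) = 2x; the resulting pdf is compared with a finite-difference derivative of the cdf $F_Y(y) = P(-\sqrt{y} \le X \le \sqrt{y})$.

```python
import math

def normal_pdf(x):
    """pdf of a standard normal r.v."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def pdf_of_square(y, fx=normal_pdf):
    """f_Y(y) for Y = X^2: sum over roots x_k of f_X(x_k) / |g'(x_k)|."""
    if y <= 0.0:
        return 0.0          # no real roots for y < 0; y = 0 is excluded since g'(0) = 0
    roots = (math.sqrt(y), -math.sqrt(y))
    return sum(fx(x) / abs(2.0 * x) for x in roots)   # g'(x) = 2x

def cdf_of_square(y):
    """F_Y(y) = P(-sqrt(y) <= X <= sqrt(y)) for standard normal X."""
    return math.erf(math.sqrt(y) / math.sqrt(2.0)) if y > 0.0 else 0.0

y, h = 1.5, 1e-5
finite_diff = (cdf_of_square(y + h) - cdf_of_square(y - h)) / (2.0 * h)
print(pdf_of_square(y), finite_diff)
```

The two printed values should agree to several decimal places, since the pdf is the derivative of the cdf wherever the latter is differentiable.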


4.3 FUNCTIONS OF TWO RANDOM VARIABLES
A. One Function of Two Random Variables:
Given two r.v.'s X and Y and a function g(x, y), the expression

$$Z = g(X, Y) \tag{4.9}$$

defines a new r.v. Z. With z a given number, we denote by $D_Z$ the subset of $R_{XY}$ [range of (X, Y)] such
that g(x, y) ≤ z. Then

$$(Z \le z) = [g(X, Y) \le z] = \{(X, Y) \in D_Z\} \tag{4.10}$$

where $\{(X, Y) \in D_Z\}$ is the event consisting of all outcomes ζ such that the point $\{X(\zeta), Y(\zeta)\} \in D_Z$.
Hence

$$F_Z(z) = P(Z \le z) = P[g(X, Y) \le z] = P\{(X, Y) \in D_Z\} \tag{4.11}$$

If X and Y are continuous r.v.'s with joint pdf $f_{XY}(x, y)$, then

$$F_Z(z) = \iint_{D_Z} f_{XY}(x, y)\,dx\,dy \tag{4.12}$$

B. Two Functions of Two Random Variables:
Given two r.v.'s X and Y and two functions g(x, y) and h(x, y), the expression

$$Z = g(X, Y) \qquad W = h(X, Y) \tag{4.13}$$

defines two new r.v.'s Z and W. With z and w two given numbers, we denote by $D_{ZW}$ the subset of $R_{XY}$
[range of (X, Y)] such that g(x, y) ≤ z and h(x, y) ≤ w. Then

$$(Z \le z, W \le w) = [g(X, Y) \le z,\ h(X, Y) \le w] = \{(X, Y) \in D_{ZW}\} \tag{4.14}$$

where $\{(X, Y) \in D_{ZW}\}$ is the event consisting of all outcomes ζ such that the point $\{X(\zeta), Y(\zeta)\} \in D_{ZW}$.
Hence

$$F_{ZW}(z, w) = P(Z \le z, W \le w) = P[g(X, Y) \le z,\ h(X, Y) \le w] = P\{(X, Y) \in D_{ZW}\} \tag{4.15}$$

In the continuous case, we have

$$F_{ZW}(z, w) = \iint_{D_{ZW}} f_{XY}(x, y)\,dx\,dy \tag{4.16}$$

Determination of $f_{ZW}(z, w)$ from $f_{XY}(x, y)$:
Let X and Y be two continuous r.v.'s with joint pdf $f_{XY}(x, y)$. If the transformation

$$z = g(x, y) \qquad w = h(x, y) \tag{4.17}$$

is one-to-one and has the inverse transformation

$$x = q(z, w) \qquad y = r(z, w) \tag{4.18}$$

then the joint pdf of Z and W is given by

$$f_{ZW}(z, w) = f_{XY}(x, y)\,|J(x, y)|^{-1} \tag{4.19}$$

where x = q(z, w), y = r(z, w), and

$$J(x, y) = \begin{vmatrix} \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} \\ \dfrac{\partial h}{\partial x} & \dfrac{\partial h}{\partial y} \end{vmatrix} = \begin{vmatrix} \dfrac{\partial z}{\partial x} & \dfrac{\partial z}{\partial y} \\ \dfrac{\partial w}{\partial x} & \dfrac{\partial w}{\partial y} \end{vmatrix} \tag{4.20}$$

which is the jacobian of the transformation (4.17). If we define

$$\bar{J}(z, w) = \begin{vmatrix} \dfrac{\partial q}{\partial z} & \dfrac{\partial q}{\partial w} \\ \dfrac{\partial r}{\partial z} & \dfrac{\partial r}{\partial w} \end{vmatrix} = \begin{vmatrix} \dfrac{\partial x}{\partial z} & \dfrac{\partial x}{\partial w} \\ \dfrac{\partial y}{\partial z} & \dfrac{\partial y}{\partial w} \end{vmatrix} \tag{4.21}$$

then

$$|\bar{J}(z, w)| = |J(x, y)|^{-1} \tag{4.22}$$

and Eq. (4.19) can be expressed as

$$f_{ZW}(z, w) = f_{XY}[q(z, w), r(z, w)]\,|\bar{J}(z, w)| \tag{4.23}$$
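
A small worked case may help fix the jacobian bookkeeping. The sketch below assumes, purely for illustration, that X and Y are independent standard normal r.v.'s and takes Z = X + Y, W = X - Y, so that x = (z + w)/2, y = (z - w)/2 and the determinant of the inverse transformation is -1/2. The resulting joint pdf is compared with the product of two N(0; 2) pdf's, which is the known distribution of (Z, W) in this special case. All function names are hypothetical.

```python
import math

def std_normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def joint_xy(x, y):
    """Assumed input: X and Y independent standard normal."""
    return std_normal_pdf(x) * std_normal_pdf(y)

def joint_zw(z, w):
    """f_ZW for Z = X + Y, W = X - Y: f_XY[q(z, w), r(z, w)] times |J-bar(z, w)|."""
    x = 0.5 * (z + w)            # x = q(z, w)
    y = 0.5 * (z - w)            # y = r(z, w)
    # J-bar = det [[dx/dz, dx/dw], [dy/dz, dy/dw]] = det [[1/2, 1/2], [1/2, -1/2]]
    jbar = 0.5 * (-0.5) - 0.5 * 0.5
    return joint_xy(x, y) * abs(jbar)

def normal_pdf(u, var):
    """pdf of a zero-mean normal r.v. with variance var."""
    return math.exp(-u * u / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

z, w = 0.7, -1.3
print(joint_zw(z, w), normal_pdf(z, 2.0) * normal_pdf(w, 2.0))
```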


4.4 FUNCTIONS OF n RANDOM VARIABLES

A. One Function of n Random Variables:

Given n r.v.'s $X_1, \ldots, X_n$ and a function $g(x_1, \ldots, x_n)$, the expression

$$Y = g(X_1, \ldots, X_n) \tag{4.24}$$

defines a new r.v. Y. Then

$$(Y \le y) = [g(X_1, \ldots, X_n) \le y] = [(X_1, \ldots, X_n) \in D_Y] \tag{4.25}$$

and

$$F_Y(y) = P[g(X_1, \ldots, X_n) \le y] = P[(X_1, \ldots, X_n) \in D_Y] \tag{4.26}$$

where $D_Y$ is the subset of the range of $(X_1, \ldots, X_n)$ such that $g(x_1, \ldots, x_n) \le y$. If $X_1, \ldots, X_n$ are
continuous r.v.'s with joint pdf $f_{X_1\cdots X_n}(x_1, \ldots, x_n)$, then

$$F_Y(y) = \int\cdots\int_{D_Y} f_{X_1\cdots X_n}(x_1, \ldots, x_n)\,dx_1\cdots dx_n \tag{4.27}$$

B. n Functions of n Random Variables:

When the joint pdf of n r.v.'s $X_1, \ldots, X_n$ is given and we want to determine the joint pdf of n r.v.'s
$Y_1, \ldots, Y_n$, where

$$\begin{aligned} Y_1 &= g_1(X_1, \ldots, X_n) \\ &\ \,\vdots \\ Y_n &= g_n(X_1, \ldots, X_n) \end{aligned} \tag{4.28}$$

the approach is the same as for two r.v.'s. We shall assume that the transformation

$$\begin{aligned} y_1 &= g_1(x_1, \ldots, x_n) \\ &\ \,\vdots \\ y_n &= g_n(x_1, \ldots, x_n) \end{aligned} \tag{4.29}$$

is one-to-one and has the inverse transformation

$$\begin{aligned} x_1 &= h_1(y_1, \ldots, y_n) \\ &\ \,\vdots \\ x_n &= h_n(y_1, \ldots, y_n) \end{aligned} \tag{4.30}$$

Then the joint pdf of $Y_1, \ldots, Y_n$ is given by

$$f_{Y_1\cdots Y_n}(y_1, \ldots, y_n) = f_{X_1\cdots X_n}(x_1, \ldots, x_n)\,|J(x_1, \ldots, x_n)|^{-1} \tag{4.31}$$

where

$$J(x_1, \ldots, x_n) = \begin{vmatrix} \dfrac{\partial g_1}{\partial x_1} & \cdots & \dfrac{\partial g_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial g_n}{\partial x_1} & \cdots & \dfrac{\partial g_n}{\partial x_n} \end{vmatrix} \tag{4.32}$$

which is the jacobian of the transformation (4.29).
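4.5 EXPECTATION
A. Expectation of a Function of One Random Variable:
The expectation of Y = g(X) is given by
$$E(Y) = E[g(X)] = \begin{cases} \sum_i g(x_i)\,p_X(x_i) & \text{(discrete case)} \\[1mm] \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx & \text{(continuous case)} \end{cases} \tag{4.33}$$
B. Expectation of a Function of More than One Random Variable:
Let $X_1, \ldots, X_n$ be n r.v.'s, and let $Y = g(X_1, \ldots, X_n)$. Then
$$E(Y) = E[g(X_1, \ldots, X_n)] = \begin{cases} \sum_{x_1} \cdots \sum_{x_n} g(x_1, \ldots, x_n)\,p_{X_1 \cdots X_n}(x_1, \ldots, x_n) & \text{(discrete case)} \\[1mm] \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, \ldots, x_n)\,f_{X_1 \cdots X_n}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n & \text{(continuous case)} \end{cases} \tag{4.34}$$
C. Linearity Property of Expectation:
Note that the expectation operation is linear (Prob. 4.39), and we have
$$E\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i E(X_i) \tag{4.35}$$
where the $a_i$'s are constants. If r.v.'s X and Y are independent, then we have (Prob. 4.41)
$$E[g(X)h(Y)] = E[g(X)]\,E[h(Y)] \tag{4.36}$$
The relation (4.36) can be generalized to a mutually independent set of n r.v.'s $X_1, \ldots, X_n$:
$$E\left[\prod_{i=1}^{n} g_i(X_i)\right] = \prod_{i=1}^{n} E[g_i(X_i)] \tag{4.37}$$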




D. Conditional Expectation as a Random Variable:

In Sec. 3.8 we defined the conditional expectation of Y given X = x, E(Y | x) [Eq. (3.58)], which is, in general, a function of x, say H(x). Now H(X) is a function of the r.v. X; that is,
$$H(X) = E(Y \mid X) \tag{4.38}$$
Thus, E(Y | X) is a function of the r.v. X. Note that E(Y | X) has the following property (Prob. 4.38):
$$E[E(Y \mid X)] = E(Y) \tag{4.39}$$
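Property (4.39) can be checked directly on a small discrete example. The sketch below uses a hypothetical joint pmf (the numbers are illustrative only and do not come from the text) and compares E[E(Y | X)] with E(Y).

```python
# Hypothetical joint pmf p_XY(x, y), chosen only for illustration.
p_xy = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def marginal_x(p):
    """Marginal pmf of X obtained by summing the joint pmf over y."""
    px = {}
    for (x, _), prob in p.items():
        px[x] = px.get(x, 0.0) + prob
    return px

def cond_exp_y_given_x(p, x):
    """E(Y | X = x), i.e. H(x) in the notation of Eq. (4.38)."""
    px = marginal_x(p)[x]
    return sum(y * prob for (xi, y), prob in p.items() if xi == x) / px

def expectation_y(p):
    """E(Y) computed directly from the joint pmf."""
    return sum(y * prob for (_, y), prob in p.items())

def expectation_of_cond_exp(p):
    """E[E(Y | X)]: average H(x) with respect to the marginal pmf of X."""
    px = marginal_x(p)
    return sum(cond_exp_y_given_x(p, x) * px[x] for x in px)
```

For this pmf both `expectation_y(p_xy)` and `expectation_of_cond_exp(p_xy)` agree up to floating-point rounding, as Eq. (4.39) requires.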


4.6 MOMENT GENERATING FUNCTIONS
A. Definition:
The moment generating function of a r.v. X is defined by
$$M_X(t) = E(e^{tX}) = \begin{cases} \sum_i e^{t x_i}\,p_X(x_i) & \text{(discrete case)} \\[1mm] \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx & \text{(continuous case)} \end{cases} \tag{4.40}$$
where t is a real variable. Note that $M_X(t)$ may not exist for all r.v.'s X. In general, $M_X(t)$ will exist only for those values of t for which the sum or integral of Eq. (4.40) converges absolutely. Suppose that $M_X(t)$ exists. If we express $e^{tX}$ formally and take expectation, then
$$M_X(t) = E(e^{tX}) = E\left[1 + tX + \frac{1}{2!}(tX)^2 + \cdots + \frac{1}{k!}(tX)^k + \cdots\right] = 1 + tE(X) + \frac{t^2}{2!}E(X^2) + \cdots + \frac{t^k}{k!}E(X^k) + \cdots \tag{4.41}$$
and the kth moment of X is given by
$$m_k = E(X^k) = M_X^{(k)}(0) \qquad k = 1, 2, \ldots \tag{4.42}$$
where
$$M_X^{(k)}(0) = \left.\frac{d^k}{dt^k} M_X(t)\right|_{t=0} \tag{4.43}$$
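Equations (4.42) and (4.43) can be illustrated numerically. The sketch below assumes an exponential r.v. with parameter lam, whose moment generating function is the standard result lam/(lam - t) for t < lam (this distribution is an assumption of the example, not taken from the text above), and approximates the first two derivatives at t = 0 by central differences.

```python
def mgf_exponential(t, lam):
    """M_X(t) = lam / (lam - t), valid only for t < lam."""
    if t >= lam:
        raise ValueError("the mgf of an exponential r.v. exists only for t < lam")
    return lam / (lam - t)

def first_moment(mgf, h=1e-4):
    """Central-difference approximation of M'(0) = E(X)."""
    return (mgf(h) - mgf(-h)) / (2.0 * h)

def second_moment(mgf, h=1e-4):
    """Central-difference approximation of M''(0) = E(X^2)."""
    return (mgf(h) - 2.0 * mgf(0.0) + mgf(-h)) / (h * h)

lam = 2.0
m1 = first_moment(lambda t: mgf_exponential(t, lam))   # close to 1/lam
m2 = second_moment(lambda t: mgf_exponential(t, lam))  # close to 2/lam**2
```

The finite-difference step h is a tuning choice; a symbolic derivative would give the moments exactly.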


B. Joint Moment Generating Function:
The joint moment generating function $M_{XY}(t_1, t_2)$ of two r.v.'s X and Y is defined by
$$M_{XY}(t_1, t_2) = E[e^{(t_1 X + t_2 Y)}] \tag{4.44}$$
where $t_1$ and $t_2$ are real variables. Proceeding as we did in Eq. (4.41), we can establish that
$$M_{XY}(t_1, t_2) = E[e^{(t_1 X + t_2 Y)}] = \sum_{k=0}^{\infty} \sum_{n=0}^{\infty} \frac{t_1^k t_2^n}{k!\,n!} E(X^k Y^n) \tag{4.45}$$
and the (k, n) joint moment of X and Y is given by
$$m_{kn} = E(X^k Y^n) = M_{XY}^{(kn)}(0, 0) \tag{4.46}$$
where
$$M_{XY}^{(kn)}(0, 0) = \left.\frac{\partial^{k+n}}{\partial t_1^k\,\partial t_2^n} M_{XY}(t_1, t_2)\right|_{t_1 = t_2 = 0} \tag{4.47}$$
In a similar fashion, we can define the joint moment generating function of n r.v.'s $X_1, \ldots, X_n$ by
$$M_{X_1 \cdots X_n}(t_1, \ldots, t_n) = E[e^{(t_1 X_1 + \cdots + t_n X_n)}] \tag{4.48}$$
from which the various moments can be computed. If $X_1, \ldots, X_n$ are independent, then
$$M_{X_1 \cdots X_n}(t_1, \ldots, t_n) = E[e^{(t_1 X_1 + \cdots + t_n X_n)}] = E(e^{t_1 X_1} \cdots e^{t_n X_n}) = E(e^{t_1 X_1}) \cdots E(e^{t_n X_n}) = M_{X_1}(t_1) \cdots M_{X_n}(t_n) \tag{4.49}$$


C. Lemmas for Moment Generating Functions:
Two important lemmas concerning moment generating functions are stated in the following:

Lemma 4.1: If two r.v.'s have the same moment generating functions, then they must have the same distribution.

Lemma 4.2: Given cdf's $F(x), F_1(x), F_2(x), \ldots$ with corresponding moment generating functions $M(t), M_1(t), M_2(t), \ldots$, then $F_n(x) \to F(x)$ if $M_n(t) \to M(t)$.


4.7 CHARACTERISTIC FUNCTIONS
A. Definition:
The characteristic function of a r.v. X is defined by
$$\Psi_X(\omega) = E(e^{j\omega X}) = \begin{cases} \sum_i e^{j\omega x_i}\,p_X(x_i) & \text{(discrete case)} \\[1mm] \int_{-\infty}^{\infty} e^{j\omega x} f_X(x)\,dx & \text{(continuous case)} \end{cases} \tag{4.50}$$
where ω is a real variable and $j = \sqrt{-1}$. Note that $\Psi_X(\omega)$ is obtained by replacing t in $M_X(t)$ by jω if $M_X(t)$ exists. Thus, the characteristic function has all the properties of the moment generating function. Now
$$|\Psi_X(\omega)| = \left|\sum_i e^{j\omega x_i} p_X(x_i)\right| \le \sum_i \left|e^{j\omega x_i} p_X(x_i)\right| = \sum_i p_X(x_i) = 1 < \infty$$
for the discrete case and
$$|\Psi_X(\omega)| = \left|\int_{-\infty}^{\infty} e^{j\omega x} f_X(x)\,dx\right| \le \int_{-\infty}^{\infty} \left|e^{j\omega x} f_X(x)\right| dx = \int_{-\infty}^{\infty} f_X(x)\,dx = 1 < \infty$$
for the continuous case. Thus, the characteristic function $\Psi_X(\omega)$ is always defined even if the moment generating function $M_X(t)$ is not (Prob. 4.58). Note that $\Psi_X(\omega)$ of Eq. (4.50) for the continuous case is the Fourier transform (with the sign of j reversed) of $f_X(x)$. Because of this fact, if $\Psi_X(\omega)$ is known, $f_X(x)$ can be found from the inverse Fourier transform; that is,
$$f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Psi_X(\omega)\,e^{-j\omega x}\,d\omega \tag{4.51}$$
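The discrete case of Eq. (4.50) and the bound $|\Psi_X(\omega)| \le 1$ can be sketched in a few lines of Python. The pmf below is hypothetical and chosen only for illustration. The last function recovers E(X) from the derivative at ω = 0, using the stated substitution of jω for t in the moment relation (4.42).

```python
import cmath

# Hypothetical pmf p_X(x_i), used only for illustration.
pmf = {0: 0.2, 1: 0.5, 3: 0.3}

def char_fn(pmf, omega):
    """Psi_X(omega) = sum_i exp(j*omega*x_i) * p_X(x_i), the discrete case of Eq. (4.50)."""
    return sum(p * cmath.exp(1j * omega * x) for x, p in pmf.items())

def mean_from_char_fn(pmf, h=1e-5):
    """E(X) = Psi'(0) / j, with Psi'(0) approximated by a central difference."""
    derivative = (char_fn(pmf, h) - char_fn(pmf, -h)) / (2.0 * h)
    return (derivative / 1j).real

# The magnitude never exceeds 1 on a grid of omega values.
max_magnitude = max(abs(char_fn(pmf, 0.1 * k)) for k in range(-100, 101))
```

For this pmf the direct mean is 0(0.2) + 1(0.5) + 3(0.3) = 1.4, which the derivative-based estimate reproduces to within the finite-difference error.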
B. Joint Characteristic Functions:

The joint characteristic function $\Psi_{XY}(\omega_1, \omega_2)$ of two r.v.'s X and Y is defined by
$$\Psi_{XY}(\omega_1, \omega_2) = E[e^{j(\omega_1 X + \omega_2 Y)}] = \begin{cases} \sum_i \sum_k e^{j(\omega_1 x_i + \omega_2 y_k)}\,p_{XY}(x_i, y_k) & \text{(discrete case)} \\[1mm] \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{j(\omega_1 x + \omega_2 y)} f_{XY}(x, y)\,dx\,dy & \text{(continuous case)} \end{cases} \tag{4.52}$$
where $\omega_1$ and $\omega_2$ are real variables.
The expression of Eq. (4.52) for the continuous case is recognized as the two-dimensional Fourier transform (with the sign of j reversed) of $f_{XY}(x, y)$. Thus, from the inverse Fourier transform, we have
$$f_{XY}(x, y) = \frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \Psi_{XY}(\omega_1, \omega_2)\,e^{-j(\omega_1 x + \omega_2 y)}\,d\omega_1\,d\omega_2 \tag{4.53}$$
From Eqs. (4.50) and (4.52), we see that
$$\Psi_X(\omega) = \Psi_{XY}(\omega, 0) \qquad \Psi_Y(\omega) = \Psi_{XY}(0, \omega) \tag{4.54}$$
which are called marginal characteristic functions.
Similarly, we can define the joint characteristic function of n r.v.'s $X_1, \ldots, X_n$ by
$$\Psi_{X_1 \cdots X_n}(\omega_1, \ldots, \omega_n) = E[e^{j(\omega_1 X_1 + \cdots + \omega_n X_n)}] \tag{4.55}$$
As in the case of the moment generating function, if $X_1, \ldots, X_n$ are independent, then
$$\Psi_{X_1 \cdots X_n}(\omega_1, \ldots, \omega_n) = \Psi_{X_1}(\omega_1) \cdots \Psi_{X_n}(\omega_n) \tag{4.56}$$

C. Lemmas for Characteristic Functions:
As with the moment generating function, we have the following two lemmas:

Lemma 4.3: A distribution function is uniquely determined by its characteristic function.

Lemma 4.4: Given cdf's $F(x), F_1(x), F_2(x), \ldots$ with corresponding characteristic functions $\Psi(\omega), \Psi_1(\omega), \Psi_2(\omega), \ldots$, then $F_n(x) \to F(x)$ at points of continuity of F(x) if and only if $\Psi_n(\omega) \to \Psi(\omega)$ for every ω.


4.8 THE LAWS OF LARGE NUMBERS AND THE CENTRAL LIMIT THEOREM
A. The Weak Law of Large Numbers:

Let $X_1, \ldots, X_n$ be a sequence of independent, identically distributed r.v.'s each with a finite mean $E(X_i) = \mu$. Let
$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i = \frac{1}{n}(X_1 + \cdots + X_n) \tag{4.57}$$
Then, for any ε > 0,
$$\lim_{n \to \infty} P(|\bar{X}_n - \mu| > \varepsilon) = 0 \tag{4.58}$$
Equation (4.58) is known as the weak law of large numbers, and $\bar{X}_n$ is known as the sample mean.
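The statement of Eq. (4.58) can be made concrete with a small simulation. The sketch below is only illustrative: it assumes uniform r.v.'s over (0, 1), so μ = 0.5, and the sample sizes, ε, number of trials, and seed are arbitrary choices. It estimates $P(|\bar{X}_n - \mu| > \varepsilon)$ for a small and a large n.

```python
import random

def prob_deviation(n, eps, trials, rng):
    """Monte Carlo estimate of P(|sample mean - mu| > eps) for n i.i.d. uniform(0, 1) draws."""
    mu = 0.5
    count = 0
    for _ in range(trials):
        mean = sum(rng.random() for _ in range(n)) / n
        if abs(mean - mu) > eps:
            count += 1
    return count / trials

rng = random.Random(0)  # fixed seed so that the run is repeatable
p_small_n = prob_deviation(10, 0.05, 500, rng)
p_large_n = prob_deviation(1000, 0.05, 500, rng)
```

With these settings the estimate for n = 1000 should be far smaller than the one for n = 10, in line with the limit in Eq. (4.58); a simulation of this kind illustrates the law but does not prove it.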


B. The Strong Law of Large Numbers:

Let $X_1, \ldots, X_n$ be a sequence of independent, identically distributed r.v.'s each with a finite mean $E(X_i) = \mu$. Then, for any ε > 0,
$$P\left(\lim_{n \to \infty} |\bar{X}_n - \mu| > \varepsilon\right) = 0 \tag{4.59}$$
where $\bar{X}_n$ is the sample mean defined by Eq. (4.57). Equation (4.59) is known as the strong law of large numbers.
Notice the important difference between Eqs. (4.58) and (4.59). Equation (4.58) tells us how a sequence of probabilities converges, and Eq. (4.59) tells us how the sequence of r.v.'s behaves in the limit. The strong law of large numbers tells us that the sequence $(\bar{X}_n)$ is converging to the constant μ.


C. The Central Limit Theorem:

The central limit theorem is one of the most remarkable results in probability theory. There are many versions of this theorem. In its simplest form, the central limit theorem is stated as follows:

Let $X_1, \ldots, X_n$ be a sequence of independent, identically distributed r.v.'s each with mean μ and variance $\sigma^2$. Let
$$Z_n = \frac{X_1 + \cdots + X_n - n\mu}{\sigma\sqrt{n}} = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \tag{4.60}$$
where $\bar{X}_n$ is defined by Eq. (4.57). Then the distribution of $Z_n$ tends to the standard normal as $n \to \infty$; that is,
$$\lim_{n \to \infty} Z_n = N(0; 1) \tag{4.61}$$
or
$$\lim_{n \to \infty} F_{Z_n}(z) = \lim_{n \to \infty} P(Z_n \le z) = \Phi(z) \tag{4.62}$$
where Φ(z) is the cdf of a standard normal r.v. [Eq. (2.54)]. Thus, the central limit theorem tells us that for large n, the distribution of the sum $S_n = X_1 + \cdots + X_n$ is approximately normal regardless of the form of the distribution of the individual $X_i$'s. Notice how much stronger this theorem is than the laws of large numbers. In practice, whenever an observed r.v. is known to be a sum of a large number of r.v.'s, then the central limit theorem gives us some justification for assuming that this sum is normally distributed.
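A rough numerical check of Eq. (4.62) is sketched below. It assumes uniform r.v.'s over (0, 1), for which μ = 1/2 and σ² = 1/12, and compares the empirical value of $P(Z_n \le 1)$ with Φ(1). The value n = 30, the number of trials, and the seed are arbitrary choices for this illustration.

```python
import math
import random

def std_normal_cdf(z):
    """Phi(z), the cdf of a standard normal r.v., written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def simulate_zn(n, rng):
    """One realization of Z_n in Eq. (4.60) for uniform(0, 1) summands."""
    mu = 0.5
    sigma = math.sqrt(1.0 / 12.0)
    total = sum(rng.random() for _ in range(n))
    return (total - n * mu) / (sigma * math.sqrt(n))

rng = random.Random(1)  # fixed seed so that the run is repeatable
trials = 5000
z = 1.0
empirical = sum(simulate_zn(30, rng) <= z for _ in range(trials)) / trials
reference = std_normal_cdf(z)
```

The two values should agree to within Monte Carlo error, which is of the order of $1/\sqrt{\text{trials}}$.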


Solved Problems 


FUNCTIONS OF ONE RANDOM VARIABLE 

4.1. If X is $N(\mu; \sigma^2)$, then show that $Z = (X - \mu)/\sigma$ is a standard normal r.v.; that is, N(0; 1).

The cdf of Z is
$$F_Z(z) = P(Z \le z) = P\left(\frac{X - \mu}{\sigma} \le z\right) = P(X \le z\sigma + \mu) = \int_{-\infty}^{z\sigma + \mu} \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(x - \mu)^2/(2\sigma^2)}\,dx$$
By the change of variable $y = (x - \mu)/\sigma$ (that is, $x = \sigma y + \mu$), we obtain
$$F_Z(z) = P(Z \le z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}\,dy$$
and
$$f_Z(z) = \frac{dF_Z(z)}{dz} = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$$
which indicates that Z = N(0; 1).

4.2. Verify Eq. (4.6).


Assume that y = g(x) is a continuous monotonically increasing function [Fig. 4-1(a)]. Since y = g(x) is monotonically increasing, it has an inverse that we denote by $x = g^{-1}(y) = h(y)$. Then
$$F_Y(y) = P(Y \le y) = P[X \le h(y)] = F_X[h(y)] \tag{4.63}$$
and
$$f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy}\{F_X[h(y)]\}$$
Applying the chain rule of differentiation to this expression yields
$$f_Y(y) = f_X[h(y)]\,\frac{d}{dy} h(y)$$
which can be written as
$$f_Y(y) = f_X(x)\,\frac{dx}{dy} \qquad x = h(y) \tag{4.64}$$
If y = g(x) is monotonically decreasing [Fig. 4-1(b)], then
$$F_Y(y) = P(Y \le y) = P[X > h(y)] = 1 - F_X[h(y)] \tag{4.65}$$
Thus,
$$f_Y(y) = \frac{d}{dy} F_Y(y) = -f_X(x)\,\frac{dx}{dy} \qquad x = h(y) \tag{4.66}$$
In Eq. (4.66), since y = g(x) is monotonically decreasing, dy/dx (and dx/dy) is negative. Combining Eqs. (4.64) and (4.66), we obtain
$$f_Y(y) = f_X(x)\left|\frac{dx}{dy}\right| = f_X[h(y)]\left|\frac{dh(y)}{dy}\right|$$
which is valid for any continuous monotonic (increasing or decreasing) function y = g(x).


4.3. Let X be a r.v. with cdf $F_X(x)$ and pdf $f_X(x)$. Let Y = aX + b, where a and b are real constants and a ≠ 0.

(a) Find the cdf of Y in terms of $F_X(x)$.
(b) Find the pdf of Y in terms of $f_X(x)$.

[Fig. 4-1, panels (a) and (b)]

[Fig. 4-2, panels (a) and (b)]

(a) If a > 0, then [Fig. 4-2(a)]
$$F_Y(y) = P(Y \le y) = P(aX + b \le y) = P\left(X \le \frac{y - b}{a}\right) = F_X\left(\frac{y - b}{a}\right) \tag{4.67}$$
If a < 0, then [Fig. 4-2(b)]
$$\begin{aligned} F_Y(y) &= P(Y \le y) = P(aX + b \le y) = P(aX \le y - b) \\ &= P\left(X \ge \frac{y - b}{a}\right) \qquad \text{(since } a < 0 \text{, note the change in the inequality sign)} \\ &= 1 - P\left(X < \frac{y - b}{a}\right) \\ &= 1 - F_X\left(\frac{y - b}{a}\right) + P\left(X = \frac{y - b}{a}\right) \end{aligned} \tag{4.68}$$
Note that if X is continuous, then P[X = (y − b)/a] = 0, and
$$F_Y(y) = 1 - F_X\left(\frac{y - b}{a}\right) \qquad a < 0 \tag{4.69}$$

(b) From Fig. 4-2, we see that y = g(x) = ax + b is a continuous monotonically increasing (a > 0) or decreasing (a < 0) function. Its inverse is $x = g^{-1}(y) = h(y) = (y - b)/a$, and dx/dy = 1/a. Thus, by Eq. (4.6),
$$f_Y(y) = \frac{1}{|a|}\,f_X\left(\frac{y - b}{a}\right) \tag{4.70}$$
Note that Eq. (4.70) can also be obtained by differentiating Eqs. (4.67) and (4.69) with respect to y.


4.4. Let Y = aX + b. Determine the pdf of Y, if X is a uniform r.v. over (0, 1).

The pdf of X is [Eq. (2.44)]
$$f_X(x) = \begin{cases} 1 & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$
Then by Eq. (4.70), we get
$$f_Y(y) = \frac{1}{|a|}\,f_X\left(\frac{y - b}{a}\right) = \begin{cases} \dfrac{1}{|a|} & y \in R_Y \\ 0 & \text{otherwise} \end{cases} \tag{4.71}$$

[Fig. 4-3: the line y = ax + b for a > 0 and for a < 0]

The range $R_Y$ is found as follows: From Fig. 4-3, we see that
For a > 0: $R_Y = \{y : b < y < a + b\}$
For a < 0: $R_Y = \{y : a + b < y < b\}$


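4.5. Let Y = aX + b. Show that if $X = N(\mu; \sigma^2)$, then $Y = N(a\mu + b; a^2\sigma^2)$, and find the values of a and b so that Y = N(0; 1).

Since $X = N(\mu; \sigma^2)$, by Eq. (2.52),
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right]$$
Hence, by Eq. (4.70),
$$f_Y(y) = \frac{1}{\sqrt{2\pi}\,|a|\,\sigma} \exp\left\{-\frac{1}{2\sigma^2}\left[\left(\frac{y - b}{a}\right) - \mu\right]^2\right\} = \frac{1}{\sqrt{2\pi}\,|a|\,\sigma} \exp\left\{-\frac{1}{2a^2\sigma^2}\,[y - (a\mu + b)]^2\right\} \tag{4.72}$$
which is the pdf of $N(a\mu + b; a^2\sigma^2)$. Hence, $Y = N(a\mu + b; a^2\sigma^2)$. Next, let $a\mu + b = 0$ and $a^2\sigma^2 = 1$, from which we get $a = 1/\sigma$ and $b = -\mu/\sigma$. Thus, $Y = (X - \mu)/\sigma$ is N(0; 1) (see Prob. 4.1).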


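4.6. Let X be a r.v. with pdf $f_X(x)$. Let $Y = X^2$. Find the pdf of Y.

The event $A = (Y \le y)$ in $R_Y$ is equivalent to the event $B = (-\sqrt{y} \le X \le \sqrt{y})$ in $R_X$ (Fig. 4-4). If y ≤ 0, then
$$F_Y(y) = P(Y \le y) = 0$$
and $f_Y(y) = 0$. If y > 0, then
$$F_Y(y) = P(Y \le y) = P(-\sqrt{y} \le X \le \sqrt{y}) = F_X(\sqrt{y}) - F_X(-\sqrt{y}) \tag{4.73}$$
and
$$f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy} F_X(\sqrt{y}) - \frac{d}{dy} F_X(-\sqrt{y}) = \frac{1}{2\sqrt{y}}\left[f_X(\sqrt{y}) + f_X(-\sqrt{y})\right]$$
Thus,
$$f_Y(y) = \begin{cases} \dfrac{1}{2\sqrt{y}}\left[f_X(\sqrt{y}) + f_X(-\sqrt{y})\right] & y > 0 \\ 0 & y < 0 \end{cases} \tag{4.74}$$

[Fig. 4-4]

Alternative Solution:
If y < 0, then the equation $y = x^2$ has no real solutions; hence $f_Y(y) = 0$. If y > 0, then $y = x^2$ has two solutions, $x_1 = \sqrt{y}$ and $x_2 = -\sqrt{y}$. Now, $y = g(x) = x^2$ and $g'(x) = 2x$. Hence, by Eq. (4.8),
$$f_Y(y) = \begin{cases} \dfrac{1}{2\sqrt{y}}\left[f_X(\sqrt{y}) + f_X(-\sqrt{y})\right] & y > 0 \\ 0 & y < 0 \end{cases}$$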
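4.7. Let $Y = X^2$. Find the pdf of Y if X = N(0; 1).

Since X = N(0; 1),
$$f_X(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$
Since $f_X(x)$ is an even function, by Eq. (4.74), we obtain
$$f_Y(y) = \begin{cases} \dfrac{1}{\sqrt{y}}\,f_X(\sqrt{y}) = \dfrac{1}{\sqrt{2\pi y}}\,e^{-y/2} & y > 0 \\ 0 & y < 0 \end{cases} \tag{4.75}$$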
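The relation between Eqs. (4.73) and (4.75) can be checked numerically: the derivative of the cdf $F_X(\sqrt{y}) - F_X(-\sqrt{y})$ should match the pdf of $Y = X^2$ when X = N(0; 1). The sketch below writes the cdf with the error function, using the standard identity $\Phi(a) - \Phi(-a) = \operatorname{erf}(a/\sqrt{2})$, and differentiates it by a central difference; the step size and function names are choices made for this illustration.

```python
import math

def cdf_y(y):
    """F_Y(y) = F_X(sqrt(y)) - F_X(-sqrt(y)) for X = N(0; 1), as in Eq. (4.73)."""
    if y <= 0.0:
        return 0.0
    return math.erf(math.sqrt(y / 2.0))

def pdf_y(y):
    """Pdf of Y = X^2 for X = N(0; 1), as in Eq. (4.75)."""
    if y <= 0.0:
        return 0.0
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y)

def numerical_derivative(func, y, h=1e-5):
    """Central-difference approximation of d/dy func(y)."""
    return (func(y + h) - func(y - h)) / (2.0 * h)
```

For example, `numerical_derivative(cdf_y, 1.5)` and `pdf_y(1.5)` should agree to several decimal places.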


4.8. Let $Y = X^2$. Find and sketch the pdf of Y if X is a uniform r.v. over (−1, 2).

The pdf of X is [Eq. (2.44)] [Fig. 4-5(a)]
$$f_X(x) = \begin{cases} \dfrac{1}{3} & -1 < x < 2 \\ 0 & \text{otherwise} \end{cases}$$
In this case, the range of Y is (0, 4), and we must be careful in applying Eq. (4.74). When 0 < y < 1, both $\sqrt{y}$ and $-\sqrt{y}$ are in $R_X = (-1, 2)$, and by Eq. (4.74),
$$f_Y(y) = \frac{1}{2\sqrt{y}}\left(\frac{1}{3} + \frac{1}{3}\right) = \frac{1}{3\sqrt{y}}$$
When 1 < y < 4, $\sqrt{y}$ is in $R_X = (-1, 2)$ but $-\sqrt{y} < -1$, and by Eq. (4.74),
$$f_Y(y) = \frac{1}{2\sqrt{y}}\left(\frac{1}{3} + 0\right) = \frac{1}{6\sqrt{y}}$$
Thus,
$$f_Y(y) = \begin{cases} \dfrac{1}{3\sqrt{y}} & 0 < y < 1 \\[2mm] \dfrac{1}{6\sqrt{y}} & 1 < y < 4 \\[2mm] 0 & \text{otherwise} \end{cases}$$
which is sketched in Fig. 4-5(b).

[Fig. 4-5, panels (a) and (b)]


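4.9. Let $Y = e^X$. Find the pdf of Y if X is a uniform r.v. over (0, 1).

The pdf of X is
$$f_X(x) = \begin{cases} 1 & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$
The cdf of Y is
$$F_Y(y) = P(Y \le y) = P(e^X \le y) = P(X \le \ln y) = \int_{-\infty}^{\ln y} f_X(x)\,dx = \int_0^{\ln y} dx = \ln y \qquad 1 < y < e$$
Thus,
$$f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy} \ln y = \frac{1}{y} \qquad 1 < y < e \tag{4.76}$$
Alternative Solution:

The function $y = g(x) = e^x$ is a continuous monotonically increasing function. Its inverse is $x = g^{-1}(y) = h(y) = \ln y$. Thus, by Eq. (4.6), we obtain
$$f_Y(y) = f_X(\ln y)\left|\frac{d}{dy} \ln y\right| = \frac{1}{y}\,f_X(\ln y) = \begin{cases} \dfrac{1}{y} & 0 < \ln y < 1 \\ 0 & \text{otherwise} \end{cases}$$
or
$$f_Y(y) = \begin{cases} \dfrac{1}{y} & 1 < y < e \\ 0 & \text{otherwise} \end{cases}$$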


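4.10. Let $Y = e^X$. Find the pdf of Y if $X = N(\mu; \sigma^2)$.

The pdf of X is [Eq. (2.52)]
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right] \qquad -\infty < x < \infty$$
Thus, using the technique shown in the alternative solution of Prob. 4.9, we obtain
$$f_Y(y) = \frac{1}{y}\,f_X(\ln y) = \frac{1}{y\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2\sigma^2}(\ln y - \mu)^2\right] \qquad 0 < y < \infty \tag{4.77}$$
Note that X = ln Y is the normal r.v.; hence, the r.v. Y is called the log-normal r.v.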
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4.11. 


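Let Y = tan X. Find the pdf of Y if X is a uniform r.v. over (−π/2, π/2).

The cdf of X is [Eq. (2.45)]
$$F_X(x) = \begin{cases} 0 & x \le -\pi/2 \\[1mm] \dfrac{1}{\pi}\left(x + \dfrac{\pi}{2}\right) & -\pi/2 < x < \pi/2 \\[1mm] 1 & x \ge \pi/2 \end{cases}$$
Now
$$F_Y(y) = P(Y \le y) = P(\tan X \le y) = P(X \le \tan^{-1} y) = F_X(\tan^{-1} y) = \frac{1}{\pi}\left(\tan^{-1} y + \frac{\pi}{2}\right) = \frac{1}{2} + \frac{1}{\pi} \tan^{-1} y \qquad -\infty < y < \infty$$
Then the pdf of Y is given by
$$f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{1}{\pi(1 + y^2)} \qquad -\infty < y < \infty$$
Note that the r.v. Y is a Cauchy r.v. with parameter 1.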


4.12. Let X be a continuous r.v. with the cdf $F_X(x)$. Let $Y = F_X(X)$. Show that Y is a uniform r.v. over (0, 1).

Notice from the properties of a cdf that $y = F_X(x)$ is a monotonically nondecreasing function. Since $0 \le F_X(x) \le 1$ for all real x, y takes on values only on the interval (0, 1). Using Eq. (4.64) (Prob. 4.2), we have
$$f_Y(y) = f_X(x)\,\frac{1}{dy/dx} = \frac{f_X(x)}{dF_X(x)/dx} = \frac{f_X(x)}{f_X(x)} = 1 \qquad 0 < y < 1$$
Hence, Y is a uniform r.v. over (0, 1).


4.13. Let Y be a uniform r.v. over (0, 1). Let F(x) be a function which has the properties of the cdf of a continuous r.v. with F(a) = 0, F(b) = 1, and F(x) strictly increasing for a < x < b, where a and b could be −∞ and ∞, respectively. Let $X = F^{-1}(Y)$. Show that the cdf of X is F(x).

$$F_X(x) = P(X \le x) = P[F^{-1}(Y) \le x]$$
Since F(x) is strictly increasing, $F^{-1}(Y) \le x$ is equivalent to $Y \le F(x)$, and hence
$$F_X(x) = P(X \le x) = P[Y \le F(x)]$$
Now Y is a uniform r.v. over (0, 1), and by Eq. (2.45),
$$F_Y(y) = P(Y \le y) = y \qquad 0 < y < 1$$
and accordingly,
$$F_X(x) = P(X \le x) = P[Y \le F(x)] = F(x) \qquad 0 < F(x) < 1$$
Note that this problem is the converse of Prob. 4.12.
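The result of Prob. 4.13 is the basis of what is commonly called inverse transform sampling. The sketch below is a minimal illustration under assumptions of our own choosing: the target cdf is taken to be $F(x) = 1 - e^{-\lambda x}$ (x > 0), whose inverse is $F^{-1}(y) = -\ln(1 - y)/\lambda$, and the value of λ, the sample size, and the seed are arbitrary.

```python
import math
import random

def inverse_cdf_exponential(y, lam):
    """F^{-1}(y) for F(x) = 1 - exp(-lam * x), x > 0."""
    return -math.log(1.0 - y) / lam

rng = random.Random(2)  # fixed seed so that the run is repeatable
lam = 1.5
# rng.random() returns values in [0, 1), so 1 - y stays strictly positive.
samples = [inverse_cdf_exponential(rng.random(), lam) for _ in range(20000)]

x = 1.0
empirical_cdf = sum(s <= x for s in samples) / len(samples)
target_cdf = 1.0 - math.exp(-lam * x)
```

The empirical cdf of the transformed samples at x = 1 should be close to F(1), up to Monte Carlo error.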
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4.14, Let X be a continuous r.v. with the pdf 


$$f_X(x) = \begin{cases} e^{-x} & x > 0 \\ 0 & x < 0 \end{cases}$$
Find the transformation Y = g(X) such that the pdf of Y is
$$f_Y(y) = \begin{cases} \dfrac{1}{2\sqrt{y}} & 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$
The cdf of X is
$$F_X(x) = \int_{-\infty}^{x} f_X(\xi)\,d\xi = \begin{cases} 1 - e^{-x} & x > 0 \\ 0 & x < 0 \end{cases}$$
Then from the result of Prob. 4.12, the r.v. $Z = 1 - e^{-X}$ is uniformly distributed over (0, 1). Similarly, the cdf of Y is
$$F_Y(y) = \int_0^{y} \frac{1}{2\sqrt{\eta}}\,d\eta = \sqrt{y} \qquad 0 < y < 1$$
and the r.v. $W = \sqrt{Y}$ is uniformly distributed over (0, 1). Thus, by setting Z = W, the required transformation is $Y = (1 - e^{-X})^2$.


FUNCTIONS OF TWO RANDOM VARIABLES 


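4.15. Consider Z = X + Y. Show that if X and Y are independent Poisson r.v.'s with parameters $\lambda_1$ and $\lambda_2$, respectively, then Z is also a Poisson r.v. with parameter $\lambda_1 + \lambda_2$.

We can write the event
$$(Z = n) = \bigcup_{i=0}^{n} (X = i, Y = n - i)$$
where events (X = i, Y = n − i), i = 0, 1, ..., n, are disjoint. Since X and Y are independent, by Eqs. (1.46) and (2.40), we have
$$\begin{aligned} P(Z = n) &= P(X + Y = n) = \sum_{i=0}^{n} P(X = i, Y = n - i) = \sum_{i=0}^{n} P(X = i)\,P(Y = n - i) \\ &= \sum_{i=0}^{n} e^{-\lambda_1} \frac{\lambda_1^i}{i!}\,e^{-\lambda_2} \frac{\lambda_2^{n-i}}{(n - i)!} = e^{-(\lambda_1 + \lambda_2)} \sum_{i=0}^{n} \frac{\lambda_1^i \lambda_2^{n-i}}{i!\,(n - i)!} \\ &= \frac{e^{-(\lambda_1 + \lambda_2)}}{n!} \sum_{i=0}^{n} \frac{n!}{i!\,(n - i)!}\,\lambda_1^i \lambda_2^{n-i} \\ &= \frac{e^{-(\lambda_1 + \lambda_2)}}{n!}\,(\lambda_1 + \lambda_2)^n \end{aligned}$$
which indicates that Z = X + Y is a Poisson r.v. with parameter $\lambda_1 + \lambda_2$.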


4.16. Consider two r.v.'s X and Y with joint pdf $f_{XY}(x, y)$. Let Z = X + Y.

(a) Determine the pdf of Z.
(b) Determine the pdf of Z if X and Y are independent.

[Fig. 4-6]

(a) The range $R_Z$ of Z corresponding to the event (Z ≤ z) = (X + Y ≤ z) is the set of points (x, y) which lie on and to the left of the line z = x + y (Fig. 4-6). Thus, we have
$$F_Z(z) = P(X + Y \le z) = \int_{-\infty}^{\infty} \left[\int_{-\infty}^{z - x} f_{XY}(x, y)\,dy\right] dx \tag{4.78}$$
Then
$$f_Z(z) = \frac{d}{dz} F_Z(z) = \int_{-\infty}^{\infty} \left[\frac{d}{dz} \int_{-\infty}^{z - x} f_{XY}(x, y)\,dy\right] dx = \int_{-\infty}^{\infty} f_{XY}(x, z - x)\,dx \tag{4.79}$$

(b) If X and Y are independent, then Eq. (4.79) reduces to
$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\,f_Y(z - x)\,dx \tag{4.80a}$$
The integral on the right-hand side of Eq. (4.80a) is known as a convolution of $f_X(z)$ and $f_Y(z)$. Since the convolution is commutative, Eq. (4.80a) can also be written as
$$f_Z(z) = \int_{-\infty}^{\infty} f_Y(y)\,f_X(z - y)\,dy \tag{4.80b}$$


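4.17. Using Eqs. (4.19) and (3.30), redo Prob. 4.16(a); that is, find the pdf of Z = X + Y.

Let Z = X + Y and W = X. The transformation z = x + y, w = x has the inverse transformation x = w, y = z − w, and
$$J(x, y) = \begin{vmatrix} \dfrac{\partial z}{\partial x} & \dfrac{\partial z}{\partial y} \\[2mm] \dfrac{\partial w}{\partial x} & \dfrac{\partial w}{\partial y} \end{vmatrix} = \begin{vmatrix} 1 & 1 \\ 1 & 0 \end{vmatrix} = -1$$
By Eq. (4.19), we obtain
$$f_{ZW}(z, w) = f_{XY}(w, z - w)$$
Hence, by Eq. (3.30), we get
$$f_Z(z) = \int_{-\infty}^{\infty} f_{ZW}(z, w)\,dw = \int_{-\infty}^{\infty} f_{XY}(w, z - w)\,dw = \int_{-\infty}^{\infty} f_{XY}(x, z - x)\,dx$$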


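4.18. Suppose that X and Y are independent standard normal r.v.'s. Find the pdf of Z = X + Y.

The pdf's of X and Y are
$$f_X(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \qquad f_Y(y) = \frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}$$
Then, by Eq. (4.80a), we have
$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\,f_Y(z - x)\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,\frac{1}{\sqrt{2\pi}}\,e^{-(z - x)^2/2}\,dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-(z^2 - 2zx + 2x^2)/2}\,dx$$
Now, $z^2 - 2zx + 2x^2 = (\sqrt{2}\,x - z/\sqrt{2})^2 + z^2/2$, and we have
$$f_Z(z) = \frac{1}{2\pi}\,e^{-z^2/4} \int_{-\infty}^{\infty} e^{-(\sqrt{2}\,x - z/\sqrt{2})^2/2}\,dx = \frac{1}{\sqrt{2\pi}}\,\frac{1}{\sqrt{2}}\,e^{-z^2/4} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-u^2/2}\,du$$
with the change of variables $u = \sqrt{2}\,x - z/\sqrt{2}$. Since the integrand is the pdf of N(0; 1), the integral is equal to unity, and we get
$$f_Z(z) = \frac{1}{\sqrt{2\pi}\sqrt{2}}\,e^{-z^2/4} = \frac{1}{\sqrt{2\pi}\sqrt{2}}\,e^{-z^2/[2(\sqrt{2})^2]}$$
which is the pdf of N(0; 2). Thus, Z is a normal r.v. with zero mean and variance 2.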


4.19. Let X and Y be independent uniform r.v.'s over (0, 1). Find and sketch the pdf of Z = X + Y.

Since X and Y are independent, we have
$$f_{XY}(x, y) = f_X(x)\,f_Y(y) = \begin{cases} 1 & 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$
The range of Z is (0, 2), and
$$F_Z(z) = P(X + Y \le z) = \iint_{x + y \le z} f_{XY}(x, y)\,dx\,dy = \iint_{x + y \le z} dx\,dy$$
If 0 < z < 1 [Fig. 4-7(a)],
$$F_Z(z) = \iint_{x + y \le z} dx\,dy = \text{shaded area} = \frac{z^2}{2}$$
and
$$f_Z(z) = \frac{d}{dz} F_Z(z) = z$$
If 1 < z < 2 [Fig. 4-7(b)],
$$F_Z(z) = \iint_{x + y \le z} dx\,dy = \text{shaded area} = 1 - \frac{(2 - z)^2}{2}$$
and
$$f_Z(z) = \frac{d}{dz} F_Z(z) = 2 - z$$
Hence,
$$f_Z(z) = \begin{cases} z & 0 < z < 1 \\ 2 - z & 1 < z < 2 \\ 0 & \text{otherwise} \end{cases}$$
which is sketched in Fig. 4-7(c). Note that the same result can be obtained by the convolution of $f_X(z)$ and $f_Y(z)$.

[Fig. 4-7, panels (a), (b), and (c)]
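The remark that the triangular pdf can also be obtained by convolution can be checked numerically. The sketch below evaluates the convolution integral of Eq. (4.80a) for two uniform pdf's over (0, 1) with a simple midpoint rule; the number of steps and the function names are choices made for this illustration.

```python
def uniform_pdf(x):
    """Pdf of a uniform r.v. over (0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def convolve_at(z, steps=10000):
    """Midpoint-rule approximation of the integral of f_X(x) f_Y(z - x) over 0 < x < 1."""
    dx = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        total += uniform_pdf(x) * uniform_pdf(z - x)
    return total * dx

def triangle_pdf(z):
    """Closed-form pdf of Z = X + Y found in Prob. 4.19."""
    if 0.0 < z < 1.0:
        return z
    if 1.0 <= z < 2.0:
        return 2.0 - z
    return 0.0
```

Evaluating `convolve_at(z)` at a few points of (0, 2) should reproduce `triangle_pdf(z)` to within the discretization error.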


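4.20. Let X and Y be independent gamma r.v.'s with respective parameters (α, λ) and (β, λ). Show that Z = X + Y is also a gamma r.v. with parameters (α + β, λ).

From Eq. (2.76) (Prob. 2.24),
$$f_X(x) = \begin{cases} \dfrac{\lambda e^{-\lambda x}(\lambda x)^{\alpha - 1}}{\Gamma(\alpha)} & x > 0 \\ 0 & x < 0 \end{cases} \qquad f_Y(y) = \begin{cases} \dfrac{\lambda e^{-\lambda y}(\lambda y)^{\beta - 1}}{\Gamma(\beta)} & y > 0 \\ 0 & y < 0 \end{cases}$$
The range of Z is (0, ∞), and using Eq. (4.80a), we have
$$f_Z(z) = \frac{1}{\Gamma(\alpha)\Gamma(\beta)} \int_0^z \lambda e^{-\lambda x}(\lambda x)^{\alpha - 1}\,\lambda e^{-\lambda(z - x)}[\lambda(z - x)]^{\beta - 1}\,dx = \frac{\lambda^{\alpha + \beta}}{\Gamma(\alpha)\Gamma(\beta)}\,e^{-\lambda z} \int_0^z x^{\alpha - 1}(z - x)^{\beta - 1}\,dx$$
By the change of variable w = x/z, we have
$$f_Z(z) = \frac{\lambda^{\alpha + \beta}}{\Gamma(\alpha)\Gamma(\beta)}\,e^{-\lambda z} z^{\alpha + \beta - 1} \int_0^1 w^{\alpha - 1}(1 - w)^{\beta - 1}\,dw = k\,e^{-\lambda z} z^{\alpha + \beta - 1}$$
where k is a constant which does not depend on z. The value of k is determined as follows: Using Eq. (2.22) and definition (2.77) of the gamma function, we have
$$\int_{-\infty}^{\infty} f_Z(z)\,dz = k \int_0^{\infty} e^{-\lambda z} z^{\alpha + \beta - 1}\,dz = \frac{k}{\lambda^{\alpha + \beta}} \int_0^{\infty} e^{-v} v^{\alpha + \beta - 1}\,dv = \frac{k}{\lambda^{\alpha + \beta}}\,\Gamma(\alpha + \beta) = 1$$
Hence, $k = \lambda^{\alpha + \beta}/\Gamma(\alpha + \beta)$ and
$$f_Z(z) = \frac{\lambda^{\alpha + \beta}}{\Gamma(\alpha + \beta)}\,e^{-\lambda z} z^{\alpha + \beta - 1} = \frac{\lambda e^{-\lambda z}(\lambda z)^{\alpha + \beta - 1}}{\Gamma(\alpha + \beta)} \qquad z > 0$$
which indicates that Z is a gamma r.v. with parameters (α + β, λ).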


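4.21. Consider two r.v.'s X and Y with joint pdf $f_{XY}(x, y)$. Determine the pdf of Z = XY.

Let Z = XY and W = X. The transformation z = xy, w = x has the inverse transformation x = w, y = z/w, and
$$\bar{J}(z, w) = \begin{vmatrix} \dfrac{\partial x}{\partial z} & \dfrac{\partial x}{\partial w} \\[2mm] \dfrac{\partial y}{\partial z} & \dfrac{\partial y}{\partial w} \end{vmatrix} = \begin{vmatrix} 0 & 1 \\[1mm] \dfrac{1}{w} & -\dfrac{z}{w^2} \end{vmatrix} = -\frac{1}{w}$$
Thus, by Eq. (4.23), we obtain
$$f_{ZW}(z, w) = \left|\frac{1}{w}\right| f_{XY}\left(w, \frac{z}{w}\right) \tag{4.81}$$
and the marginal pdf of Z is
$$f_Z(z) = \int_{-\infty}^{\infty} \left|\frac{1}{w}\right| f_{XY}\left(w, \frac{z}{w}\right) dw \tag{4.82}$$

4.22. Let X and Y be independent uniform r.v.'s over (0, 1). Find the pdf of Z = XY.

We have
$$f_{XY}(x, y) = \begin{cases} 1 & 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$
The range of Z is (0, 1). Then
$$f_{XY}\left(w, \frac{z}{w}\right) = \begin{cases} 1 & 0 < w < 1,\ 0 < z/w < 1 \\ 0 & \text{otherwise} \end{cases}$$
or
$$f_{XY}\left(w, \frac{z}{w}\right) = \begin{cases} 1 & 0 < z < w < 1 \\ 0 & \text{otherwise} \end{cases}$$
By Eq. (4.82),
$$f_Z(z) = \int_z^1 \frac{1}{w}\,dw = -\ln z \qquad 0 < z < 1$$
Thus,
$$f_Z(z) = \begin{cases} -\ln z & 0 < z < 1 \\ 0 & \text{otherwise} \end{cases}$$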
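The integral of Eq. (4.82) can also be evaluated numerically, which gives a quick check of the result −ln z for the product of two independent uniform r.v.'s over (0, 1). The sketch below uses a midpoint rule over 0 < w < 1; the number of steps and the function names are assumptions of this illustration.

```python
import math

def joint_pdf(x, y):
    """Joint pdf of two independent uniform r.v.'s over (0, 1)."""
    return 1.0 if (0.0 < x < 1.0 and 0.0 < y < 1.0) else 0.0

def pdf_product(z, steps=100000):
    """Midpoint-rule approximation of Eq. (4.82), restricted to 0 < w < 1 where the integrand can be nonzero."""
    dw = 1.0 / steps
    total = 0.0
    for k in range(steps):
        w = (k + 0.5) * dw
        total += joint_pdf(w, z / w) / abs(w)
    return total * dw
```

For instance, `pdf_product(0.3)` should be close to `-math.log(0.3)`.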


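4.23. Consider two r.v.'s X and Y with joint pdf $f_{XY}(x, y)$. Determine the pdf of Z = X/Y.

Let Z = X/Y and W = Y. The transformation z = x/y, w = y has the inverse transformation x = zw, y = w, and
$$\bar{J}(z, w) = \begin{vmatrix} \dfrac{\partial x}{\partial z} & \dfrac{\partial x}{\partial w} \\[2mm] \dfrac{\partial y}{\partial z} & \dfrac{\partial y}{\partial w} \end{vmatrix} = \begin{vmatrix} w & z \\ 0 & 1 \end{vmatrix} = w$$
Thus, by Eq. (4.23), we obtain
$$f_{ZW}(z, w) = |w|\,f_{XY}(zw, w) \tag{4.83}$$
and the marginal pdf of Z is
$$f_Z(z) = \int_{-\infty}^{\infty} |w|\,f_{XY}(zw, w)\,dw \tag{4.84}$$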


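4.24. Let X and Y be independent standard normal r.v.'s. Find the pdf of Z = X/Y.

Since X and Y are independent, using Eq. (4.84), we have
$$\begin{aligned} f_Z(z) &= \int_{-\infty}^{\infty} |w|\,f_X(zw)\,f_Y(w)\,dw = \int_{-\infty}^{\infty} |w|\,\frac{1}{2\pi}\,e^{-w^2(1 + z^2)/2}\,dw \\ &= \frac{1}{2\pi} \int_0^{\infty} w\,e^{-w^2(1 + z^2)/2}\,dw - \frac{1}{2\pi} \int_{-\infty}^{0} w\,e^{-w^2(1 + z^2)/2}\,dw \\ &= \frac{1}{\pi} \int_0^{\infty} w\,e^{-w^2(1 + z^2)/2}\,dw = \frac{1}{\pi(1 + z^2)} \qquad -\infty < z < \infty \end{aligned}$$
which is the pdf of a Cauchy r.v. with parameter 1.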


4.25. Let X and Y be two r.v.'s with joint pdf $f_{XY}(x, y)$ and joint cdf $F_{XY}(x, y)$. Let Z = max(X, Y).

(a) Find the cdf of Z.
(b) Find the pdf of Z if X and Y are independent.

(a) The region in the xy plane corresponding to the event {max(X, Y) ≤ z} is shown as the shaded area in Fig. 4-8. Then
$$F_Z(z) = P(Z \le z) = P(X \le z, Y \le z) = F_{XY}(z, z) \tag{4.85}$$
(b) If X and Y are independent, then
$$F_Z(z) = F_X(z)\,F_Y(z)$$
and differentiating with respect to z gives
$$f_Z(z) = f_X(z)\,F_Y(z) + F_X(z)\,f_Y(z) \tag{4.86}$$

[Fig. 4-8]


4.26. Let X and Y be two r.v.'s with joint pdf $f_{XY}(x, y)$ and joint cdf $F_{XY}(x, y)$. Let W = min(X, Y).
(a) Find the cdf of W.
(b) Find the pdf of W if X and Y are independent.

(a) The region in the xy plane corresponding to the event {min(X, Y) ≤ w} is shown as the shaded area in Fig. 4-9. Then
$$P(W \le w) = P\{(X \le w) \cup (Y \le w)\} = P(X \le w) + P(Y \le w) - P\{(X \le w) \cap (Y \le w)\}$$
Thus,
$$F_W(w) = F_X(w) + F_Y(w) - F_{XY}(w, w) \tag{4.87}$$
(b) If X and Y are independent, then
$$F_W(w) = F_X(w) + F_Y(w) - F_X(w)\,F_Y(w)$$
and differentiating with respect to w gives
$$f_W(w) = f_X(w) + f_Y(w) - f_X(w)\,F_Y(w) - F_X(w)\,f_Y(w) = f_X(w)[1 - F_Y(w)] + f_Y(w)[1 - F_X(w)] \tag{4.88}$$


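4.27. Let X and Y be two r.v.'s with joint pdf $f_{XY}(x, y)$. Let
$$R = \sqrt{X^2 + Y^2} \qquad \Theta = \tan^{-1}\frac{Y}{X} \tag{4.89}$$
Find $f_{R\Theta}(r, \theta)$ in terms of $f_{XY}(x, y)$.

We assume that r ≥ 0 and 0 ≤ θ ≤ 2π. With this assumption, the transformation
$$\sqrt{x^2 + y^2} = r \qquad \tan^{-1}\frac{y}{x} = \theta$$
has the inverse transformation
$$x = r\cos\theta \qquad y = r\sin\theta$$
Since
$$\bar{J}(r, \theta) = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r$$
by Eq. (4.23) we obtain
$$f_{R\Theta}(r, \theta) = r\,f_{XY}(r\cos\theta, r\sin\theta) \tag{4.90}$$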

4.28. A voltage V is a function of time t and is given by
$$V(t) = X\cos\omega t + Y\sin\omega t \tag{4.91}$$
in which ω is a constant angular frequency and $X = Y = N(0; \sigma^2)$ and they are independent.

(a) Show that V(t) may be written as
$$V(t) = R\cos(\omega t - \Theta) \tag{4.92}$$
(b) Find the pdf's of r.v.'s R and Θ and show that R and Θ are independent.

(a) We have
$$\begin{aligned} V(t) &= X\cos\omega t + Y\sin\omega t \\ &= \sqrt{X^2 + Y^2}\left(\frac{X}{\sqrt{X^2 + Y^2}}\cos\omega t + \frac{Y}{\sqrt{X^2 + Y^2}}\sin\omega t\right) \\ &= \sqrt{X^2 + Y^2}\,(\cos\Theta\cos\omega t + \sin\Theta\sin\omega t) \\ &= R\cos(\omega t - \Theta) \end{aligned}$$
where
$$R = \sqrt{X^2 + Y^2} \qquad \text{and} \qquad \Theta = \tan^{-1}\frac{Y}{X}$$
which is the transformation (4.89).

(b) Since $X = Y = N(0; \sigma^2)$ and they are independent, we have
$$f_{XY}(x, y) = \frac{1}{2\pi\sigma^2}\,e^{-(x^2 + y^2)/(2\sigma^2)}$$
Thus, using Eq. (4.90), we get
$$f_{R\Theta}(r, \theta) = r\,f_{XY}(r\cos\theta, r\sin\theta) = \frac{r}{2\pi\sigma^2}\,e^{-r^2/(2\sigma^2)} \tag{4.93}$$
Now
$$f_R(r) = \int_0^{2\pi} f_{R\Theta}(r, \theta)\,d\theta = \frac{r}{2\pi\sigma^2}\,e^{-r^2/(2\sigma^2)} \int_0^{2\pi} d\theta = \frac{r}{\sigma^2}\,e^{-r^2/(2\sigma^2)} \tag{4.94}$$
$$f_\Theta(\theta) = \int_0^{\infty} f_{R\Theta}(r, \theta)\,dr = \frac{1}{2\pi\sigma^2} \int_0^{\infty} r\,e^{-r^2/(2\sigma^2)}\,dr = \frac{1}{2\pi} \tag{4.95}$$
and $f_{R\Theta}(r, \theta) = f_R(r)\,f_\Theta(\theta)$; hence, R and Θ are independent.
Note that R is a Rayleigh r.v. (Prob. 2.23), and Θ is a uniform r.v. over (0, 2π).
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4.29. Let X, Y, and Z be independent standard normal r.v.’s. Let W = (X? + ¥? 4+ Z*)'/?, Find the 
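pdf of W, where $W = (X^2 + Y^2 + Z^2)^{1/2}$.

We have
$$f_{XYZ}(x, y, z) = f_X(x)\,f_Y(y)\,f_Z(z) = \frac{1}{(2\pi)^{3/2}}\,e^{-(x^2 + y^2 + z^2)/2}$$
and
$$F_W(w) = P(W \le w) = P(X^2 + Y^2 + Z^2 \le w^2) = \frac{1}{(2\pi)^{3/2}} \iiint_{R_W} e^{-(x^2 + y^2 + z^2)/2}\,dx\,dy\,dz$$
where $R_W = \{(x, y, z) : x^2 + y^2 + z^2 \le w^2\}$. Using spherical coordinates (Fig. 4-10), we have
$$x^2 + y^2 + z^2 = r^2 \qquad dx\,dy\,dz = r^2\sin\theta\,dr\,d\theta\,d\varphi$$
and
$$\begin{aligned} F_W(w) &= \frac{1}{(2\pi)^{3/2}} \int_0^{2\pi} \int_0^{\pi} \int_0^{w} e^{-r^2/2}\,r^2\sin\theta\,dr\,d\theta\,d\varphi \\ &= \frac{1}{(2\pi)^{3/2}} \int_0^{2\pi} d\varphi \int_0^{\pi} \sin\theta\,d\theta \int_0^{w} r^2 e^{-r^2/2}\,dr \\ &= \frac{4\pi}{(2\pi)^{3/2}} \int_0^{w} r^2 e^{-r^2/2}\,dr \end{aligned} \tag{4.96}$$
Thus, the pdf of W is
$$f_W(w) = \frac{d}{dw} F_W(w) = \begin{cases} \sqrt{\dfrac{2}{\pi}}\,w^2 e^{-w^2/2} & w > 0 \\ 0 & w < 0 \end{cases} \tag{4.97}$$

Fig. 4-10 Spherical coordinates.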


4.30. Let $X_1, \ldots, X_n$ be n independent r.v.'s each with the identical pdf f(x). Let $Z = \max(X_1, \ldots, X_n)$. Find the pdf of Z.

The probability P(z < Z < z + dz) is equal to the probability that one of the r.v.'s falls in (z, z + dz) and all others are less than z. The probability that one of $X_i$ (i = 1, ..., n) falls in (z, z + dz) and all others are less than z is
$$f(z)\,dz \left(\int_{-\infty}^{z} f(x)\,dx\right)^{n - 1}$$
Since there are n ways of choosing the variable to be the maximum, we have
$$f_Z(z) = n f(z) \left(\int_{-\infty}^{z} f(x)\,dx\right)^{n - 1} = n f(z)\,[F(z)]^{n - 1} \tag{4.98}$$
When n = 2, Eq. (4.98) reduces to
$$f_Z(z) = 2 f(z) \int_{-\infty}^{z} f(x)\,dx = 2 f(z)\,F(z) \tag{4.99}$$
which is the same as Eq. (4.86) (Prob. 4.25) with $f_X(z) = f_Y(z) = f(z)$ and $F_X(z) = F_Y(z) = F(z)$.


4.31. Let $X_1, \ldots, X_n$ be n independent r.v.'s each with the identical pdf f(x). Let $W = \min(X_1, \ldots, X_n)$. Find the pdf of W.

The probability P(w < W < w + dw) is equal to the probability that one of the r.v.'s falls in (w, w + dw) and all others are greater than w. The probability that one of $X_i$ (i = 1, ..., n) falls in (w, w + dw) and all others are greater than w is
$$f(w)\,dw \left(\int_{w}^{\infty} f(x)\,dx\right)^{n - 1}$$
Since there are n ways of choosing the variable to be the minimum, we have
$$f_W(w) = n f(w) \left(\int_{w}^{\infty} f(x)\,dx\right)^{n - 1} = n f(w)\,[1 - F(w)]^{n - 1} \tag{4.100}$$
When n = 2, Eq. (4.100) reduces to
$$f_W(w) = 2 f(w) \int_{w}^{\infty} f(x)\,dx = 2 f(w)\,[1 - F(w)] \tag{4.101}$$
which is the same as Eq. (4.88) (Prob. 4.26) with $f_X(w) = f_Y(w) = f(w)$ and $F_X(w) = F_Y(w) = F(w)$.
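Integrating Eqs. (4.98) and (4.100) gives the cdf's $[F(z)]^n$ for the maximum and $1 - [1 - F(w)]^n$ for the minimum. The simulation sketch below checks both for uniform r.v.'s over (0, 1), where F(x) = x on that interval; the value n = 5, the evaluation points, the number of trials, and the seed are arbitrary choices for this illustration.

```python
import random

def simulate_max_min(n, trials, rng):
    """Draw `trials` samples of (max, min) of n i.i.d. uniform(0, 1) r.v.'s."""
    maxima, minima = [], []
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        maxima.append(max(xs))
        minima.append(min(xs))
    return maxima, minima

rng = random.Random(3)  # fixed seed so that the run is repeatable
n = 5
maxima, minima = simulate_max_min(n, 20000, rng)

empirical_max_cdf = sum(m <= 0.8 for m in maxima) / len(maxima)
empirical_min_cdf = sum(m <= 0.2 for m in minima) / len(minima)
theory_max_cdf = 0.8 ** n                  # [F(z)]^n at z = 0.8
theory_min_cdf = 1.0 - (1.0 - 0.2) ** n    # 1 - [1 - F(w)]^n at w = 0.2
```

Both empirical values should match the theoretical ones to within Monte Carlo error.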


4.32. Let $X_i$, i = 1, ..., n, be n independent gamma r.v.'s with respective parameters $(\alpha_i, \lambda)$, i = 1, ..., n. Let
$$Y = X_1 + \cdots + X_n = \sum_{i=1}^{n} X_i$$
Show that Y is also a gamma r.v. with parameters $\left(\sum_{i=1}^{n} \alpha_i, \lambda\right)$.

We prove this proposition by induction. Let us assume that the proposition is true for n = k; that is,
$$Z = X_1 + \cdots + X_k = \sum_{i=1}^{k} X_i$$
is a gamma r.v. with parameters $(\beta, \lambda) = \left(\sum_{i=1}^{k} \alpha_i, \lambda\right)$. Let
$$W = Z + X_{k+1} = \sum_{i=1}^{k+1} X_i$$
Then, by the result of Prob. 4.20, we see that W is a gamma r.v. with parameters $(\beta + \alpha_{k+1}, \lambda) = \left(\sum_{i=1}^{k+1} \alpha_i, \lambda\right)$. Hence, the proposition is true for n = k + 1. Next, by the result of Prob. 4.20, the proposition is true for n = 2. Thus, we conclude that the proposition is true for any n ≥ 2.

4.33.
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Let X,,..., X, be n independent exponential r.v.’s each with parameter A. Let 
$$Y = X_1 + \cdots + X_n = \sum_{i=1}^{n} X_i$$
Show that Y is a gamma r.v. with parameters (n, λ), where λ is the common parameter of the exponential r.v.'s $X_1, \ldots, X_n$.

We note that an exponential r.v. with parameter λ is a gamma r.v. with parameters (1, λ) (Prob. 2.24). Thus, from the result of Prob. 4.32 and setting $\alpha_i = 1$, we conclude that Y is a gamma r.v. with parameters (n, λ).


4.34. Let $Z_1, \ldots, Z_n$ be n independent standard normal r.v.'s. Let
$$Y = Z_1^2 + \cdots + Z_n^2 = \sum_{i=1}^{n} Z_i^2$$
Find the pdf of Y.

Let $Y_i = Z_i^2$. Then by Eq. (4.75) (Prob. 4.7), the pdf of $Y_i$ is
$$f_{Y_i}(y) = \begin{cases} \dfrac{1}{\sqrt{2\pi y}}\,e^{-y/2} & y > 0 \\ 0 & y < 0 \end{cases}$$
Now, using Eq. (2.80), we can rewrite
$$\frac{1}{\sqrt{2\pi y}}\,e^{-y/2} = \frac{\frac{1}{2} e^{-y/2} (y/2)^{1/2 - 1}}{\sqrt{\pi}} = \frac{\frac{1}{2} e^{-y/2} (y/2)^{1/2 - 1}}{\Gamma(\frac{1}{2})}$$
and we recognize the above as the pdf of a gamma r.v. with parameters $(\frac{1}{2}, \frac{1}{2})$ [Eq. (2.76)]. Thus, by the result of Prob. 4.32, we conclude that Y is the gamma r.v. with parameters $(n/2, \frac{1}{2})$ and
$$f_Y(y) = \begin{cases} \dfrac{\frac{1}{2} e^{-y/2} (y/2)^{n/2 - 1}}{\Gamma(n/2)} = \dfrac{e^{-y/2}\,y^{n/2 - 1}}{2^{n/2}\,\Gamma(n/2)} & y > 0 \\ 0 & y < 0 \end{cases} \tag{4.102}$$
When n is an even integer, $\Gamma(n/2) = [(n/2) - 1]!$, whereas when n is odd, $\Gamma(n/2)$ can be obtained from $\Gamma(\alpha) = (\alpha - 1)\Gamma(\alpha - 1)$ [Eq. (2.78)] and $\Gamma(\frac{1}{2}) = \sqrt{\pi}$ [Eq. (2.80)].

Note that Eq. (4.102) is referred to as the chi-square ($\chi^2$) density function with n degrees of freedom, and Y is known as the chi-square ($\chi^2$) r.v. with n degrees of freedom. It is important to recognize that the sum of the squares of n independent standard normal r.v.'s is a chi-square r.v. with n degrees of freedom. The chi-square distribution plays an important role in statistical analysis.
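As a sanity check on Eq. (4.102), the sketch below integrates the chi-square density numerically for n = 4 and confirms that the total probability is 1 and that the mean equals n (the mean of a gamma r.v. with parameters (n/2, 1/2) is (n/2)/(1/2) = n, a standard result). The integration range, the number of steps, and the function names are choices made for this illustration.

```python
import math

def chi_square_pdf(y, n):
    """Chi-square density with n degrees of freedom, as in Eq. (4.102)."""
    if y <= 0.0:
        return 0.0
    return math.exp(-y / 2.0) * y ** (n / 2.0 - 1.0) / (2.0 ** (n / 2.0) * math.gamma(n / 2.0))

def integrate(func, a, b, steps=100000):
    """Midpoint-rule approximation of the integral of func over (a, b)."""
    h = (b - a) / steps
    return sum(func(a + (k + 0.5) * h) for k in range(steps)) * h

n = 4
# The upper limit 80 truncates a tail whose probability is negligible for n = 4.
total_probability = integrate(lambda y: chi_square_pdf(y, n), 0.0, 80.0)
mean = integrate(lambda y: y * chi_square_pdf(y, n), 0.0, 80.0)
```

For small odd n the density is unbounded or steep near y = 0, so a plain midpoint rule would need more care there; n = 4 avoids that issue.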


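4.35. Let $X_1$, $X_2$, and $X_3$ be independent standard normal r.v.'s. Let
$$Y_1 = X_1 + X_2 + X_3 \qquad Y_2 = X_1 - X_2 \qquad Y_3 = X_2 - X_3$$
Determine the joint pdf of $Y_1$, $Y_2$, and $Y_3$.

Let
$$\begin{aligned} y_1 &= x_1 + x_2 + x_3 \\ y_2 &= x_1 - x_2 \\ y_3 &= x_2 - x_3 \end{aligned} \tag{4.103}$$
By Eq. (4.32), the jacobian of transformation (4.103) is
$$J(x_1, x_2, x_3) = \begin{vmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \\ 0 & 1 & -1 \end{vmatrix} = 3$$
Thus, solving the system (4.103), we get
$$\begin{aligned} x_1 &= \tfrac{1}{3}(y_1 + 2y_2 + y_3) \\ x_2 &= \tfrac{1}{3}(y_1 - y_2 + y_3) \\ x_3 &= \tfrac{1}{3}(y_1 - y_2 - 2y_3) \end{aligned}$$
Then by Eq. (4.31), we obtain
$$f_{Y_1 Y_2 Y_3}(y_1, y_2, y_3) = \frac{1}{3}\,f_{X_1 X_2 X_3}\left(\frac{y_1 + 2y_2 + y_3}{3}, \frac{y_1 - y_2 + y_3}{3}, \frac{y_1 - y_2 - 2y_3}{3}\right)$$
Since $X_1$, $X_2$, and $X_3$ are independent,
$$f_{X_1 X_2 X_3}(x_1, x_2, x_3) = \prod_{i=1}^{3} f_{X_i}(x_i) = \frac{1}{(2\pi)^{3/2}}\,e^{-(x_1^2 + x_2^2 + x_3^2)/2}$$
Hence,
$$f_{Y_1 Y_2 Y_3}(y_1, y_2, y_3) = \frac{1}{3(2\pi)^{3/2}}\,e^{-q(y_1, y_2, y_3)/2}$$
where
$$q(y_1, y_2, y_3) = \left(\frac{y_1 + 2y_2 + y_3}{3}\right)^2 + \left(\frac{y_1 - y_2 + y_3}{3}\right)^2 + \left(\frac{y_1 - y_2 - 2y_3}{3}\right)^2 = \tfrac{1}{3} y_1^2 + \tfrac{2}{3} y_2^2 + \tfrac{2}{3} y_3^2 + \tfrac{2}{3} y_2 y_3$$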
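The algebra of Prob. 4.35 is easy to verify mechanically. The sketch below (plain Python, with hypothetical helper names) recomputes the jacobian determinant of transformation (4.103), checks that the stated inverse really undoes the transformation, and compares the expanded quadratic form with the sum of squares.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Rows are the partial derivatives of y1, y2, y3 with respect to x1, x2, x3.
jacobian = [[1, 1, 1], [1, -1, 0], [0, 1, -1]]

def forward(x1, x2, x3):
    """Transformation (4.103)."""
    return (x1 + x2 + x3, x1 - x2, x2 - x3)

def inverse(y1, y2, y3):
    """Inverse transformation obtained by solving (4.103)."""
    return ((y1 + 2 * y2 + y3) / 3, (y1 - y2 + y3) / 3, (y1 - y2 - 2 * y3) / 3)

def q_sum_of_squares(y1, y2, y3):
    """q as x1^2 + x2^2 + x3^2 evaluated at the inverse point."""
    return sum(x * x for x in inverse(y1, y2, y3))

def q_expanded(y1, y2, y3):
    """q in the expanded form given at the end of the solution."""
    return (y1 ** 2 + 2 * y2 ** 2 + 2 * y3 ** 2 + 2 * y2 * y3) / 3
```

A round trip such as `inverse(*forward(0.4, -1.1, 2.5))` should return the starting point up to rounding.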


EXPECTATION 

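4.36. Let X be a uniform r.v. over (0, 1) and $Y = e^X$.
(a) Find E(Y) by using $f_Y(y)$.
(b) Find E(Y) by using $f_X(x)$.

(a) From Eq. (4.76) (Prob. 4.9),
$$f_Y(y) = \begin{cases} \dfrac{1}{y} & 1 < y < e \\ 0 & \text{otherwise} \end{cases}$$
Hence,
$$E(Y) = \int_{-\infty}^{\infty} y\,f_Y(y)\,dy = \int_1^e dy = e - 1$$
(b) The pdf of X is
$$f_X(x) = \begin{cases} 1 & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$
Then, by Eq. (4.33),
$$E(Y) = \int_{-\infty}^{\infty} e^x f_X(x)\,dx = \int_0^1 e^x\,dx = e - 1$$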
w 0 


4.37. Let Y = aX + b, where a and b are constants. Show that
$$\text{(a)} \quad E(Y) = E(aX + b) = aE(X) + b \tag{4.105}$$
$$\text{(b)} \quad \operatorname{Var}(Y) = \operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X) \tag{4.106}$$
We verify for the continuous case. The proof for the discrete case is similar.

(a) By Eq. (4.33),
$$E(Y) = E(aX + b) = \int_{-\infty}^{\infty} (ax + b)\,f_X(x)\,dx = a \int_{-\infty}^{\infty} x f_X(x)\,dx + b \int_{-\infty}^{\infty} f_X(x)\,dx = aE(X) + b$$
(b) Using Eq. (4.105), we have
$$\operatorname{Var}(Y) = \operatorname{Var}(aX + b) = E\{(aX + b - [aE(X) + b])^2\} = E\{a^2 [X - E(X)]^2\} = a^2 E\{[X - E(X)]^2\} = a^2 \operatorname{Var}(X)$$


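4.38. Verify Eq. (4.39).

Using Eqs. (3.58) and (3.38), we have
$$\begin{aligned} E[E(Y \mid X)] &= \int_{-\infty}^{\infty} E(Y \mid x)\,f_X(x)\,dx = \int_{-\infty}^{\infty} \left[\int_{-\infty}^{\infty} y\,f_{Y|X}(y \mid x)\,dy\right] f_X(x)\,dx \\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y\,\frac{f_{XY}(x, y)}{f_X(x)}\,f_X(x)\,dx\,dy = \int_{-\infty}^{\infty} y \left[\int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\right] dy \\ &= \int_{-\infty}^{\infty} y\,f_Y(y)\,dy = E(Y) \end{aligned}$$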


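4.39. Let Z = aX + bY, where a and b are constants. Show that
$$E(Z) = E(aX + bY) = aE(X) + bE(Y) \tag{4.107}$$
We verify for the continuous case. The proof for the discrete case is similar.
$$\begin{aligned} E(Z) = E(aX + bY) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (ax + by)\,f_{XY}(x, y)\,dx\,dy \\ &= a \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x f_{XY}(x, y)\,dx\,dy + b \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y f_{XY}(x, y)\,dx\,dy \\ &= a \int_{-\infty}^{\infty} x \left[\int_{-\infty}^{\infty} f_{XY}(x, y)\,dy\right] dx + b \int_{-\infty}^{\infty} y \left[\int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\right] dy \\ &= a \int_{-\infty}^{\infty} x f_X(x)\,dx + b \int_{-\infty}^{\infty} y f_Y(y)\,dy = aE(X) + bE(Y) \end{aligned}$$
Note that Eq. (4.107) (the linearity of E) can be easily extended to n r.v.'s:
$$E\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i E(X_i) \tag{4.108}$$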


Let Y = aX +b. 
(a) Find the covariance of X and Y. 


(b) Find the correlation coefficient of X and Y. 


(a) By Eq. (4.107), we have 
E(X Y) = E(X(aX + 6)] = aE(X*) + bE(X) 
E(Y) = E(aX + b) = aE(X) +6 
Thus, the covariance of X and Y is [Eq. (3.5/)} 
Covn(X, Y) = ayy = E(XY) — E(X)E(Y) 

= aE(X*) + bE(X) — E(X)[aE(X) + 6] 

= al F(X?) — [E(X)]*} = ao,’ 
(b) By Eq. (4.106), we have oy = |a|oy. Thus, the correlation coefficient of X and Y is [Eq. (3.53)} 


Oxy acy’ a 1 a>o 
Pxy 


ByOy dylalay ja] —I a<0 


(4.107) 


(4.108) 


(4.109) 


(4.110) 
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4.41. Verify Eq. (4.36). 


Since X and Y are independent, we have 
E(g(X)h(Y)] = | | Ax)h(y)Sxy(x, y) dx dy 
= | . | alhly) fel) fyly) ax dy 


-{ Ax) f(x) ix| h(y) fry) dy 


-— om - 


= Elg(X)JELA(Y)] 


The proof for the discrete case is similar. 


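4.42. Let X and Y be defined by

$$X = \cos\Theta \qquad Y = \sin\Theta$$

where $\Theta$ is a random variable uniformly distributed over $(0, 2\pi)$.

(a) Show that X and Y are uncorrelated.
(b) Show that X and Y are not independent.

(a) We have

$$f_\Theta(\theta) = \begin{cases} \dfrac{1}{2\pi} & 0 < \theta < 2\pi \\ 0 & \text{otherwise} \end{cases}$$

Then

$$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^{2\pi} \cos\theta\, f_\Theta(\theta)\,d\theta = \frac{1}{2\pi}\int_0^{2\pi} \cos\theta\,d\theta = 0$$

Similarly,

$$E(Y) = \frac{1}{2\pi}\int_0^{2\pi} \sin\theta\,d\theta = 0$$

$$E(XY) = \frac{1}{2\pi}\int_0^{2\pi} \cos\theta\sin\theta\,d\theta = \frac{1}{4\pi}\int_0^{2\pi} \sin 2\theta\,d\theta = 0 = E(X)E(Y)$$

Thus, by Eq. (3.52), X and Y are uncorrelated.

(b)

$$E(X^2) = \frac{1}{2\pi}\int_0^{2\pi} \cos^2\theta\,d\theta = \frac{1}{4\pi}\int_0^{2\pi} (1 + \cos 2\theta)\,d\theta = \frac{1}{2}$$

$$E(Y^2) = \frac{1}{2\pi}\int_0^{2\pi} \sin^2\theta\,d\theta = \frac{1}{4\pi}\int_0^{2\pi} (1 - \cos 2\theta)\,d\theta = \frac{1}{2}$$

$$E(X^2Y^2) = \frac{1}{2\pi}\int_0^{2\pi} \cos^2\theta\sin^2\theta\,d\theta = \frac{1}{16\pi}\int_0^{2\pi} (1 - \cos 4\theta)\,d\theta = \frac{1}{8}$$

Hence

$$E(X^2Y^2) = \tfrac{1}{8} \ne \tfrac{1}{4} = E(X^2)E(Y^2)$$

If X and Y were independent, then by Eq. (4.36), we would have $E(X^2Y^2) = E(X^2)E(Y^2)$. Therefore, X
and Y are not independent.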
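The moments used in Prob. 4.42 can be reproduced numerically. The following is a minimal sketch, assuming a midpoint-rule average over $\Theta$; the helper name `mean_over_theta` is introduced here only for illustration.

```python
import math

def mean_over_theta(g, n=100_000):
    """E[g(Theta)] for Theta uniform on (0, 2*pi), by the midpoint rule."""
    h = 2 * math.pi / n
    return sum(g((k + 0.5) * h) for k in range(n)) * h / (2 * math.pi)

e_xy = mean_over_theta(lambda t: math.cos(t) * math.sin(t))          # E(XY)
e_x2 = mean_over_theta(lambda t: math.cos(t) ** 2)                   # E(X^2)
e_y2 = mean_over_theta(lambda t: math.sin(t) ** 2)                   # E(Y^2)
e_x2y2 = mean_over_theta(lambda t: (math.cos(t) * math.sin(t)) ** 2)  # E(X^2 Y^2)

print(e_xy, e_x2 * e_y2, e_x2y2)
```

The first value is (numerically) zero, so the pair is uncorrelated, while the last two values differ, which is the dependence shown in part (b).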


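4.43. Let $X_1, \ldots, X_n$ be n r.v.'s. Show that

$$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j\,\mathrm{Cov}(X_i, X_j) \tag{4.111}$$

If $X_1, \ldots, X_n$ are pairwise independent, then

$$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i) \tag{4.112}$$

Let

$$Y = \sum_{i=1}^{n} a_i X_i$$

Then by Eq. (4.108), we have

$$\begin{aligned}
\mathrm{Var}(Y) = E\{[Y - E(Y)]^2\} &= E\left\{\left(\sum_{i=1}^{n} a_i[X_i - E(X_i)]\right)^2\right\} \\
&= E\left\{\sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j [X_i - E(X_i)][X_j - E(X_j)]\right\} \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j E\{[X_i - E(X_i)][X_j - E(X_j)]\} \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j\,\mathrm{Cov}(X_i, X_j)
\end{aligned}$$

If $X_1, \ldots, X_n$ are pairwise independent, then (Prob. 3.22)

$$\mathrm{Cov}(X_i, X_j) = \begin{cases} \mathrm{Var}(X_i) & i = j \\ 0 & i \ne j \end{cases}$$

and Eq. (4.111) reduces to

$$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i)$$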


MOMENT GENERATING FUNCTIONS 
4.44. Let the moments of a discrete r.v. X be given by

$$E(X^k) = 0.8 \qquad k = 1, 2, \ldots$$

(a) Find the moment generating function of X.
(b) Find P(X = 0) and P(X = 1).

(a) By Eq. (4.41), the moment generating function of X is

$$\begin{aligned}
M_X(t) &= 1 + tE(X) + \frac{t^2}{2!}E(X^2) + \cdots + \frac{t^k}{k!}E(X^k) + \cdots \\
&= 1 + 0.8\left(t + \frac{t^2}{2!} + \cdots + \frac{t^k}{k!} + \cdots\right) = 1 + 0.8\sum_{k=1}^{\infty}\frac{t^k}{k!} \\
&= 0.2 + 0.8\sum_{k=0}^{\infty}\frac{t^k}{k!} = 0.2 + 0.8e^t
\end{aligned} \tag{4.113}$$

(b) By definition (4.40),

$$M_X(t) = E(e^{tX}) = \sum_i e^{tx_i} p_X(x_i) \tag{4.114}$$

Thus, equating Eqs. (4.113) and (4.114), we obtain

$$p_X(0) = P(X = 0) = 0.2 \qquad p_X(1) = P(X = 1) = 0.8$$


4.45. Let X be a Bernoulli r.v.
(a) Find the moment generating function of X.
(b) Find the mean and variance of X.

(a) By definition (4.40) and Eq. (2.32),

$$M_X(t) = E(e^{tX}) = \sum_i e^{tx_i} p_X(x_i) = e^{t(0)}p_X(0) + e^{t(1)}p_X(1) = (1 - p) + pe^t \tag{4.115}$$

(b) By Eq. (4.42),

$$E(X) = M_X'(0) = pe^t\big|_{t=0} = p$$
$$E(X^2) = M_X''(0) = pe^t\big|_{t=0} = p$$

Hence, $\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = p - p^2 = p(1 - p)$


4.46. Let X be a binomial r.v. with parameters (n, p).
(a) Find the moment generating function of X.
(b) Find the mean and variance of X.

(a) By definition (4.40) and Eq. (2.36), and letting q = 1 - p, we get

$$M_X(t) = E(e^{tX}) = \sum_{k=0}^{n} e^{tk}\binom{n}{k} p^k q^{n-k} = \sum_{k=0}^{n}\binom{n}{k}(pe^t)^k q^{n-k} = (q + pe^t)^n \tag{4.116}$$

(b) The first two derivatives of $M_X(t)$ are

$$M_X'(t) = n(q + pe^t)^{n-1}pe^t$$
$$M_X''(t) = n(q + pe^t)^{n-1}pe^t + n(n - 1)(q + pe^t)^{n-2}(pe^t)^2$$

Thus, by Eq. (4.42),

$$\mu_X = E(X) = M_X'(0) = np$$
$$E(X^2) = M_X''(0) = np + n(n - 1)p^2$$

Hence, $\sigma_X^2 = E(X^2) - [E(X)]^2 = np(1 - p)$


4.47. Let X be a Poisson r.v. with parameter $\lambda$.
(a) Find the moment generating function of X.
(b) Find the mean and variance of X.

(a) By definition (4.40) and Eq. (2.40),

$$M_X(t) = E(e^{tX}) = \sum_{i=0}^{\infty} e^{ti} e^{-\lambda}\frac{\lambda^i}{i!} = e^{-\lambda}\sum_{i=0}^{\infty}\frac{(\lambda e^t)^i}{i!} = e^{-\lambda}e^{\lambda e^t} = e^{\lambda(e^t - 1)} \tag{4.117}$$

(b) The first two derivatives of $M_X(t)$ are

$$M_X'(t) = \lambda e^t e^{\lambda(e^t - 1)}$$
$$M_X''(t) = (\lambda e^t)^2 e^{\lambda(e^t - 1)} + \lambda e^t e^{\lambda(e^t - 1)}$$

Thus, by Eq. (4.42),

$$E(X) = M_X'(0) = \lambda \qquad E(X^2) = M_X''(0) = \lambda^2 + \lambda$$

Hence, $\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$
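The step from a moment generating function to moments via Eq. (4.42) can be illustrated numerically. The sketch below replaces the exact derivatives at t = 0 with central finite differences; this substitution, the helper `moments_from_mgf`, and the value of `lam` are assumptions made only for the illustration.

```python
import math

def moments_from_mgf(mgf, h=1e-3):
    """Return (E[X], E[X^2]) from central finite differences of the mgf at t = 0."""
    first = (mgf(h) - mgf(-h)) / (2 * h)
    second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / (h * h)
    return first, second

lam = 2.5  # hypothetical Poisson parameter

def poisson_mgf(t):
    # Eq. (4.117): M_X(t) = exp(lam * (e^t - 1))
    return math.exp(lam * (math.exp(t) - 1.0))

mean, second_moment = moments_from_mgf(poisson_mgf)
variance = second_moment - mean ** 2
print(mean, variance)  # both should be close to lam
```

The same helper can be pointed at the Bernoulli or binomial mgf's of Eqs. (4.115) and (4.116) to check their means and variances in the same way.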


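4.48. Let X be an exponential r.v. with parameter $\lambda$.
(a) Find the moment generating function of X.
(b) Find the mean and variance of X.

(a) By definition (4.40) and Eq. (2.48),

$$M_X(t) = E(e^{tX}) = \int_0^{\infty} \lambda e^{-\lambda x} e^{tx}\,dx = \frac{\lambda}{t - \lambda} e^{(t - \lambda)x}\Big|_0^{\infty} = \frac{\lambda}{\lambda - t} \qquad \lambda > t \tag{4.118}$$

(b) The first two derivatives of $M_X(t)$ are

$$M_X'(t) = \frac{\lambda}{(\lambda - t)^2} \qquad M_X''(t) = \frac{2\lambda}{(\lambda - t)^3}$$

Thus, by Eq. (4.42),

$$E(X) = M_X'(0) = \frac{1}{\lambda} \qquad E(X^2) = M_X''(0) = \frac{2}{\lambda^2}$$

Hence,

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{1}{\lambda^2}$$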


4.49. Find the moment generating function of the standard normal r.v. X = N(0; 1) and calculate the
first three moments of X.

By definition (4.40) and Eq. (2.52),

$$M_X(t) = E(e^{tX}) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} e^{tx}\,dx$$

Combining the exponents and completing the square, that is,

$$\frac{x^2}{2} - tx = \frac{(x - t)^2}{2} - \frac{t^2}{2}$$

we obtain

$$M_X(t) = e^{t^2/2}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-(x - t)^2/2}\,dx = e^{t^2/2} \tag{4.119}$$

since the integrand is the pdf of N(t; 1).

Differentiating $M_X(t)$ with respect to t three times, we have

$$M_X'(t) = te^{t^2/2} \qquad M_X''(t) = (t^2 + 1)e^{t^2/2} \qquad M_X^{(3)}(t) = (t^3 + 3t)e^{t^2/2}$$

Thus, by Eq. (4.42),

$$E(X) = M_X'(0) = 0 \qquad E(X^2) = M_X''(0) = 1 \qquad E(X^3) = M_X^{(3)}(0) = 0$$


4.50. Let Y = aX + b. Let $M_X(t)$ be the moment generating function of X. Show that the moment
generating function of Y is given by

$$M_Y(t) = e^{tb}M_X(at) \tag{4.120}$$

By Eqs. (4.40) and (4.105),

$$M_Y(t) = E(e^{tY}) = E[e^{t(aX + b)}] = e^{tb}E(e^{atX}) = e^{tb}M_X(at)$$


4.51. Find the moment generating function of a normal r.v. $N(\mu; \sigma^2)$.

If X is N(0; 1), then from Prob. 4.1 (or Prob. 4.37), we see that $Y = \sigma X + \mu$ is $N(\mu; \sigma^2)$. Then by setting
$a = \sigma$ and $b = \mu$ in Eq. (4.120) (Prob. 4.50) and using Eq. (4.119), we get

$$M_Y(t) = e^{\mu t}M_X(\sigma t) = e^{\mu t}e^{(\sigma t)^2/2} = e^{\mu t + \sigma^2t^2/2} \tag{4.121}$$


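4.52. Let $X_1, \ldots, X_n$ be n independent r.v.'s and let the moment generating function of $X_i$ be $M_{X_i}(t)$.
Let $Y = X_1 + \cdots + X_n$. Find the moment generating function of Y.

By definition (4.40),

$$\begin{aligned}
M_Y(t) = E(e^{tY}) &= E[e^{t(X_1 + \cdots + X_n)}] = E(e^{tX_1}\cdots e^{tX_n}) \\
&= E(e^{tX_1})\cdots E(e^{tX_n}) \qquad \text{(independence)} \\
&= M_{X_1}(t)\cdots M_{X_n}(t)
\end{aligned} \tag{4.122}$$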


4.53. Show that if $X_1, \ldots, X_n$ are independent Bernoulli r.v.'s with the parameter p, then
$Y = X_1 + \cdots + X_n$ is a binomial r.v. with the parameters (n, p).

Using Eqs. (4.122) and (4.115), the moment generating function of Y is

$$M_Y(t) = \prod_{i=1}^{n} (q + pe^t) = (q + pe^t)^n \qquad q = 1 - p$$

which is the moment generating function of a binomial r.v. with parameters (n, p) [Eq. (4.116)]. Hence, Y is
a binomial r.v. with parameters (n, p).
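The conclusion of Prob. 4.53 can also be checked directly on the pmf's, without moment generating functions. The sketch below is an illustration under that swapped-in approach: it convolves n Bernoulli pmf's and compares the result with the binomial pmf. The function names are hypothetical.

```python
from math import comb

def convolve(p, q):
    """pmf of the sum of two independent nonnegative-integer r.v.'s with pmf's p and q."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def sum_of_bernoullis_pmf(n, p):
    """pmf of X_1 + ... + X_n for independent Bernoulli(p) r.v.'s."""
    pmf = [1.0]  # pmf of the constant 0 (empty sum)
    for _ in range(n):
        pmf = convolve(pmf, [1.0 - p, p])
    return pmf

def binomial_pmf(n, p):
    """Binomial pmf with parameters (n, p), for k = 0, ..., n."""
    return [comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]

print(sum_of_bernoullis_pmf(3, 0.5))
print(binomial_pmf(3, 0.5))
```

Multiplying moment generating functions, as in Eq. (4.122), corresponds to convolving pmf's, which is why the two lists coincide.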


4.54. Show that if $X_1, \ldots, X_n$ are independent Poisson r.v.'s $X_i$ having parameter $\lambda_i$, then
$Y = X_1 + \cdots + X_n$ is also a Poisson r.v. with parameter $\lambda = \lambda_1 + \cdots + \lambda_n$.

Using Eqs. (4.122) and (4.117), the moment generating function of Y is

$$M_Y(t) = \prod_{i=1}^{n} e^{\lambda_i(e^t - 1)} = e^{(\sum\lambda_i)(e^t - 1)} = e^{\lambda(e^t - 1)}$$

which is the moment generating function of a Poisson r.v. with parameter $\lambda$. Hence, Y is a Poisson r.v. with
parameter $\lambda = \sum\lambda_i = \lambda_1 + \cdots + \lambda_n$.
Note that Prob. 4.15 is a special case for n = 2. 


4.55. Show that if $X_1, \ldots, X_n$ are independent normal r.v.'s and $X_i = N(\mu_i; \sigma_i^2)$, then
$Y = X_1 + \cdots + X_n$ is also a normal r.v. with mean $\mu = \mu_1 + \cdots + \mu_n$ and variance
$\sigma^2 = \sigma_1^2 + \cdots + \sigma_n^2$.

Using Eqs. (4.122) and (4.121), the moment generating function of Y is

$$M_Y(t) = \prod_{i=1}^{n} e^{\mu_i t + \sigma_i^2t^2/2} = e^{(\sum\mu_i)t + (\sum\sigma_i^2)t^2/2} = e^{\mu t + \sigma^2t^2/2}$$

which is the moment generating function of a normal r.v. with mean $\mu$ and variance $\sigma^2$. Hence, Y is a
normal r.v. with mean $\mu = \mu_1 + \cdots + \mu_n$ and variance $\sigma^2 = \sigma_1^2 + \cdots + \sigma_n^2$.
Note that Prob. 4.18 is a special case for n = 2 with y, = O and o? = 


4.56. Find the moment generating function of a gamma r.v. Y with parameters $(n, \lambda)$.

From Prob. 4.33, we see that if $X_1, \ldots, X_n$ are independent exponential r.v.'s, each with parameter $\lambda$,
then $Y = X_1 + \cdots + X_n$ is a gamma r.v. with parameters $(n, \lambda)$. Thus, by Eqs. (4.122) and (4.118), the
moment generating function of Y is

$$M_Y(t) = \prod_{i=1}^{n}\left(\frac{\lambda}{\lambda - t}\right) = \left(\frac{\lambda}{\lambda - t}\right)^n \tag{4.123}$$


CHARACTERISTIC FUNCTIONS


4.57. The r.v. X can take on the values $x_1 = -1$ and $x_2 = +1$ with pmf's $p_X(x_1) = p_X(x_2) = 0.5$.
Determine the characteristic function of X.

By definition (4.50), the characteristic function of X is

$$\Psi_X(\omega) = 0.5e^{-j\omega} + 0.5e^{j\omega} = \tfrac{1}{2}(e^{j\omega} + e^{-j\omega}) = \cos\omega$$


4.58. Find the characteristic function of a Cauchy r.v. X with parameter a and pdf given by

$$f_X(x) = \frac{a}{\pi(x^2 + a^2)} \qquad -\infty < x < \infty$$

By direct integration (or from the Table of Fourier transforms in Appendix B), we have the following
Fourier transform pair:

$$e^{-a|x|} \leftrightarrow \frac{2a}{\omega^2 + a^2}$$

Now, by the duality property of the Fourier transform, we have the following Fourier transform pair:

$$\frac{2a}{x^2 + a^2} \leftrightarrow 2\pi e^{-a|-\omega|} = 2\pi e^{-a|\omega|}$$

or (by the linearity property of the Fourier transform)

$$\frac{a}{\pi(x^2 + a^2)} \leftrightarrow e^{-a|\omega|}$$

Thus, the characteristic function of X is

$$\Psi_X(\omega) = e^{-a|\omega|} \tag{4.124}$$

Note that the moment generating function of the Cauchy r.v. X does not exist, since $E(X^n) \to \infty$ for $n \ge 2$.


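4.59. The characteristic function of a r.v. X is given by

$$\Psi_X(\omega) = \begin{cases} 1 - |\omega| & |\omega| < 1 \\ 0 & |\omega| > 1 \end{cases}$$

Find the pdf of X.

From formula (4.51), we obtain the pdf of X as

$$\begin{aligned}
f_X(x) &= \frac{1}{2\pi}\int_{-\infty}^{\infty} \Psi_X(\omega)e^{-j\omega x}\,d\omega \\
&= \frac{1}{2\pi}\left[\int_{-1}^{0} (1 + \omega)e^{-j\omega x}\,d\omega + \int_{0}^{1} (1 - \omega)e^{-j\omega x}\,d\omega\right] \\
&= \frac{1}{2\pi x^2}(2 - e^{jx} - e^{-jx}) = \frac{1}{\pi x^2}(1 - \cos x) \\
&= \frac{1}{2\pi}\left[\frac{\sin(x/2)}{x/2}\right]^2 \qquad -\infty < x < \infty
\end{aligned}$$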
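The inversion in Prob. 4.59 can be sanity-checked numerically. The following sketch, which assumes a simple midpoint rule and hypothetical function names, evaluates the inversion integral of formula (4.51) for the triangular characteristic function and compares it with the closed form.

```python
import math

def pdf_from_triangular_cf(x, n=20_000):
    """(1/2pi) * integral over (-1, 1) of (1 - |w|) exp(-j w x) dw, by the midpoint rule.
    The imaginary part cancels by symmetry, so only the cosine term is integrated."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        w = -1.0 + (k + 0.5) * h
        total += (1.0 - abs(w)) * math.cos(w * x)
    return total * h / (2.0 * math.pi)

def pdf_closed_form(x):
    """(1/2pi) [sin(x/2) / (x/2)]^2, with the limiting value 1/(2pi) at x = 0."""
    if x == 0.0:
        return 1.0 / (2.0 * math.pi)
    return (math.sin(x / 2.0) / (x / 2.0)) ** 2 / (2.0 * math.pi)

for x in (0.0, 0.7, 3.0, -5.0):
    print(x, pdf_from_triangular_cf(x), pdf_closed_form(x))
```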


4.60. Find the characteristic function of a normal r.v. $X = N(\mu; \sigma^2)$.

The moment generating function of $N(\mu; \sigma^2)$ is [Eq. (4.121)]

$$M_X(t) = e^{\mu t + \sigma^2t^2/2}$$

Thus the characteristic function of $N(\mu; \sigma^2)$ is obtained by setting $t = j\omega$ in $M_X(t)$; that is,

$$\Psi_X(\omega) = e^{\mu t + \sigma^2t^2/2}\Big|_{t = j\omega} = e^{j\omega\mu - \sigma^2\omega^2/2} \tag{4.125}$$


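4.61. Let Y = aX + b. Show that if $\Psi_X(\omega)$ is the characteristic function of X, then the characteristic
function of Y is given by

$$\Psi_Y(\omega) = e^{j\omega b}\Psi_X(a\omega) \tag{4.126}$$

By definition (4.50),

$$\Psi_Y(\omega) = E(e^{j\omega Y}) = E[e^{j\omega(aX + b)}] = e^{j\omega b}E(e^{ja\omega X}) = e^{j\omega b}\Psi_X(a\omega)$$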


4.62. Using the characteristic function technique, redo part (b) of Prob. 4.16.

Let Z = X + Y, where X and Y are independent. Then

$$\Psi_Z(\omega) = E(e^{j\omega Z}) = E(e^{j\omega(X + Y)}) = E(e^{j\omega X})E(e^{j\omega Y}) = \Psi_X(\omega)\Psi_Y(\omega) \tag{4.127}$$

Applying the convolution theorem of the Fourier transform (Appendix B), we obtain

$$f_Z(z) = \mathscr{F}^{-1}[\Psi_Z(\omega)] = \mathscr{F}^{-1}[\Psi_X(\omega)\Psi_Y(\omega)] = f_X(z) * f_Y(z) = \int_{-\infty}^{\infty} f_X(x)f_Y(z - x)\,dx$$


THE LAWS OF LARGE NUMBERS AND THE CENTRAL LIMIT THEOREM 
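4.63. Verify the weak law of large numbers (4.58); that is,

$$\lim_{n\to\infty} P(|\bar{X}_n - \mu| > \varepsilon) = 0 \qquad \text{for any } \varepsilon > 0$$

where $\bar{X}_n = \frac{1}{n}(X_1 + \cdots + X_n)$ and $E(X_i) = \mu$, $\mathrm{Var}(X_i) = \sigma^2$.

Using Eqs. (4.108) and (4.112), we have

$$E(\bar{X}_n) = \mu \qquad \text{and} \qquad \mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n} \tag{4.128}$$

Then it follows from Chebyshev's inequality [Eq. (2.97)] (Prob. 2.36) that

$$P(|\bar{X}_n - \mu| > \varepsilon) \le \frac{\sigma^2}{n\varepsilon^2} \tag{4.129}$$

Since $\lim_{n\to\infty} \sigma^2/(n\varepsilon^2) = 0$, we get

$$\lim_{n\to\infty} P(|\bar{X}_n - \mu| > \varepsilon) = 0$$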


4.64. Let X be a r.v. with pdf $f_X(x)$ and let $X_1, \ldots, X_n$ be a set of independent r.v.'s each with pdf $f_X(x)$.
Then the set of r.v.'s $X_1, \ldots, X_n$ is called a random sample of size n of X. The sample mean is
defined by

$$\bar{X}_n = \frac{1}{n}(X_1 + \cdots + X_n) = \frac{1}{n}\sum_{i=1}^{n} X_i \tag{4.130}$$

Let $X_1, \ldots, X_n$ be a random sample of X with mean $\mu$ and variance $\sigma^2$. How many
samples of X should be taken if the probability that the sample mean will not deviate from the
true mean $\mu$ by more than $\sigma/10$ is at least 0.95?

Setting $\varepsilon = \sigma/10$ in Eq. (4.129), we have

$$P\left(|\bar{X}_n - \mu| > \frac{\sigma}{10}\right) = 1 - P\left(|\bar{X}_n - \mu| \le \frac{\sigma}{10}\right) \le \frac{\sigma^2}{n\sigma^2/100} = \frac{100}{n}$$

or

$$P\left(|\bar{X}_n - \mu| \le \frac{\sigma}{10}\right) \ge 1 - \frac{100}{n}$$

Thus if we want this probability to be at least 0.95, we must have $100/n \le 0.05$ or $n \ge 100/0.05 = 2000$.
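The sample-size argument of Prob. 4.64 generalizes to any tolerance and confidence level. The helper below is a hypothetical sketch of that calculation based on Eq. (4.129); exact rational arithmetic is used (an implementation choice made here) so that boundary cases such as n = 2000 are not disturbed by rounding.

```python
from fractions import Fraction
import math

def chebyshev_sample_size(eps_in_sigmas, confidence):
    """Smallest n with 1 - 1/(n * k^2) >= confidence, where the tolerance is
    eps = k * sigma and k = eps_in_sigmas [Chebyshev bound, Eq. (4.129)]."""
    k = Fraction(eps_in_sigmas)
    bound = 1 / (k ** 2 * (1 - Fraction(confidence)))
    return math.ceil(bound)

n_required = chebyshev_sample_size("1/10", "0.95")
print(n_required)
```

Because Chebyshev's inequality holds for every distribution with finite variance, the resulting n is a conservative bound rather than the smallest sample size for any particular distribution.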


4.65. Verify the central limit theorem (4.61).

Let $X_1, \ldots, X_n$ be a sequence of independent, identically distributed r.v.'s with $E(X_i) = \mu$ and
$\mathrm{Var}(X_i) = \sigma^2$. Consider the sum $S_n = X_1 + \cdots + X_n$. Then by Eqs. (4.108) and (4.112), we have $E(S_n) = n\mu$
and $\mathrm{Var}(S_n) = n\sigma^2$. Let

$$Z_n = \frac{S_n - n\mu}{\sqrt{n}\,\sigma} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\frac{X_i - \mu}{\sigma}\right) \tag{4.131}$$

Then by Eqs. (4.105) and (4.106), we have $E(Z_n) = 0$ and $\mathrm{Var}(Z_n) = 1$. Let M(t) be the moment generating
function of the standardized r.v. $Y_i = (X_i - \mu)/\sigma$. Since $E(Y_i) = 0$ and $E(Y_i^2) = \mathrm{Var}(Y_i) = 1$, by Eq. (4.42), we
have

$$M(0) = 1 \qquad M'(0) = E(Y_i) = 0 \qquad M''(0) = E(Y_i^2) = 1$$

Given that $M'(t)$ and $M''(t)$ are continuous functions of t, a Taylor (or Maclaurin) expansion of M(t) about
t = 0 can be expressed as

$$M(t) = M(0) + M'(0)t + M''(t_1)\frac{t^2}{2!} = 1 + M''(t_1)\frac{t^2}{2} \qquad 0 \le t_1 \le t$$

By adding and subtracting $t^2/2$, we have

$$M(t) = 1 + \tfrac{1}{2}t^2 + \tfrac{1}{2}[M''(t_1) - 1]t^2 \tag{4.132}$$

Now, by Eqs. (4.120) and (4.122), the moment generating function of $Z_n$ is

$$M_{Z_n}(t) = \left[M\left(\frac{t}{\sqrt{n}}\right)\right]^n \tag{4.133}$$

Using Eq. (4.132), Eq. (4.133) can be written as

$$M_{Z_n}(t) = \left[1 + \frac{1}{2}\left(\frac{t}{\sqrt{n}}\right)^2 + \frac{1}{2}[M''(t_1) - 1]\left(\frac{t}{\sqrt{n}}\right)^2\right]^n$$

where now $t_1$ is between 0 and $t/\sqrt{n}$. Since $M''(t)$ is continuous at t = 0 and $t_1 \to 0$ as $n \to \infty$, we have

$$\lim_{n\to\infty} [M''(t_1) - 1] = M''(0) - 1 = 1 - 1 = 0$$

Thus, from elementary calculus, $\lim_{n\to\infty}(1 + x/n)^n = e^x$, and we obtain

$$\lim_{n\to\infty} M_{Z_n}(t) = \lim_{n\to\infty}\left\{1 + \frac{t^2}{2n} + \frac{1}{2n}[M''(t_1) - 1]t^2\right\}^n = \lim_{n\to\infty}\left(1 + \frac{t^2/2}{n}\right)^n = e^{t^2/2}$$

The right-hand side is the moment generating function of the standard normal r.v. Z = N(0; 1) [Eq.
(4.119)]. Hence, by Lemma 4.2 of the moment generating function,

$$\lim_{n\to\infty} Z_n = N(0; 1)$$


4.66. Let $X_1, \ldots, X_n$ be n independent Cauchy r.v.'s with the identical pdf shown in Prob. 4.58. Let

$$Y_n = \frac{1}{n}(X_1 + \cdots + X_n) = \frac{1}{n}\sum_{i=1}^{n} X_i$$

(a) Find the characteristic function of $Y_n$.
(b) Find the pdf of $Y_n$.
(c) Does the central limit theorem hold?

(a) From Eq. (4.124), the characteristic function of $X_i$ is

$$\Psi_{X_i}(\omega) = e^{-a|\omega|}$$

Let $Y = X_1 + \cdots + X_n$. Then the characteristic function of Y is

$$\Psi_Y(\omega) = E(e^{j\omega Y}) = E[e^{j\omega(X_1 + \cdots + X_n)}] = \prod_{i=1}^{n} \Psi_{X_i}(\omega) = e^{-na|\omega|} \tag{4.134}$$

Now $Y_n = (1/n)Y$. Thus, by Eq. (4.126), the characteristic function of $Y_n$ is

$$\Psi_{Y_n}(\omega) = \Psi_Y\left(\frac{\omega}{n}\right) = e^{-na|\omega/n|} = e^{-a|\omega|} \tag{4.135}$$

(b) Equation (4.135) indicates that $Y_n$ is also a Cauchy r.v. with parameter a, and its pdf is the same as that
of $X_i$.

(c) Since the characteristic function of $Y_n$ is independent of n and so is its pdf, $Y_n$ does not tend to a normal
r.v. as $n \to \infty$, and so the central limit theorem does not hold in this case. This does not contradict
Prob. 4.65, because the Cauchy r.v. has no finite mean or variance, which that proof assumes.
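The conclusion of Prob. 4.66 can be observed in simulation. The sketch below is an illustration only: it draws Cauchy samples by inverting the cdf (a standard technique, swapped in here), and the seed, trial count, and function names are assumptions. If averaging helped, the spread of the sample mean would shrink with n; for the Cauchy r.v. it does not.

```python
import math
import random

def cauchy_sample(rng, a=1.0):
    """Draw from the Cauchy pdf a / (pi (x^2 + a^2)) by inverting its cdf."""
    return a * math.tan(math.pi * (rng.random() - 0.5))

def interquartile_range_of_mean(n, trials=4000, seed=12345):
    """Empirical interquartile range of the sample mean of n Cauchy r.v.'s (a = 1)."""
    rng = random.Random(seed)
    means = sorted(
        sum(cauchy_sample(rng) for _ in range(n)) / n for _ in range(trials)
    )
    return means[(3 * trials) // 4] - means[trials // 4]

iqr_n1 = interquartile_range_of_mean(1)
iqr_n100 = interquartile_range_of_mean(100)
print(iqr_n1, iqr_n100)  # both near 2a = 2, the interquartile range of the Cauchy pdf
```

The interquartile range is used instead of the sample variance because the variance of a Cauchy r.v. does not exist.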


4.67. Let Y be a binomial r.v. with parameters (n, p). Using the central limit theorem, derive the
approximation formula

$$P(Y \le y) \approx \Phi\left(\frac{y - np}{\sqrt{np(1 - p)}}\right) \tag{4.136}$$

where $\Phi(z)$ is the cdf of a standard normal r.v. [Eq. (2.54)].

We saw in Prob. 4.53 that if $X_1, \ldots, X_n$ are independent Bernoulli r.v.'s, each with parameter p, then
$Y = X_1 + \cdots + X_n$ is a binomial r.v. with parameters (n, p). Since the $X_i$'s are independent, we can apply the
central limit theorem to the r.v. $Z_n$ defined by

$$Z_n = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\frac{X_i - E(X_i)}{\sqrt{\mathrm{Var}(X_i)}}\right) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\frac{X_i - p}{\sqrt{p(1 - p)}}\right) \tag{4.137}$$

Thus, for large n, $Z_n$ is approximately normally distributed and

$$P(Z_n \le x) \approx \Phi(x) \tag{4.138}$$

Substituting Eq. (4.137) into Eq. (4.138) gives

$$P\left[\frac{1}{\sqrt{np(1 - p)}}\left(\sum_{i=1}^{n}(X_i - p)\right) \le x\right] = P[Y \le x\sqrt{np(1 - p)} + np] \approx \Phi(x)$$

or

$$P(Y \le y) \approx \Phi\left(\frac{y - np}{\sqrt{np(1 - p)}}\right)$$

Because we are approximating a discrete distribution by a continuous one, a slightly better approximation
is given by

$$P(Y \le y) \approx \Phi\left(\frac{y + \frac{1}{2} - np}{\sqrt{np(1 - p)}}\right) \tag{4.139}$$

Formula (4.139) is referred to as a continuity correction of Eq. (4.136).


4.68. Let Y be a Poisson r.v. with parameter $\lambda$. Using the central limit theorem, derive the approximation
formula

$$P(Y \le y) \approx \Phi\left(\frac{y - \lambda}{\sqrt{\lambda}}\right) \tag{4.140}$$

We saw in Prob. 4.54 that if $X_1, \ldots, X_n$ are independent Poisson r.v.'s $X_i$ having parameter $\lambda_i$, then
$Y = X_1 + \cdots + X_n$ is also a Poisson r.v. with parameter $\lambda = \lambda_1 + \cdots + \lambda_n$. Using this fact, we can view a
Poisson r.v. Y with parameter $\lambda$ as a sum of independent Poisson r.v.'s $X_i$, i = 1, ..., n, each with parameter
$\lambda/n$; that is,

$$Y = \sum_{i=1}^{n} X_i$$

The central limit theorem then implies that the r.v. Z defined by

$$Z = \frac{Y - E(Y)}{\sqrt{\mathrm{Var}(Y)}} = \frac{Y - \lambda}{\sqrt{\lambda}} \tag{4.141}$$

is approximately normal and

$$P(Z \le z) \approx \Phi(z) \tag{4.142}$$

Substituting Eq. (4.141) into Eq. (4.142) gives

$$P\left(\frac{Y - \lambda}{\sqrt{\lambda}} \le z\right) = P(Y \le \sqrt{\lambda}\,z + \lambda) \approx \Phi(z)$$

or

$$P(Y \le y) \approx \Phi\left(\frac{y - \lambda}{\sqrt{\lambda}}\right)$$

Again, using a continuity correction, a slightly better approximation is given by

$$P(Y \le y) \approx \Phi\left(\frac{y + \frac{1}{2} - \lambda}{\sqrt{\lambda}}\right) \tag{4.143}$$

Supplementary Problems 


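4.69. Let Y = 2X + 3. Find the pdf of Y if X is a uniform r.v. over (-1, 2).

Ans. $$f_Y(y) = \begin{cases} \frac{1}{6} & 1 < y < 7 \\ 0 & \text{otherwise} \end{cases}$$


4.70. Let X be a r.v. with pdf $f_X(x)$. Let Y = |X|. Find the pdf of Y in terms of $f_X(x)$.

Ans. $$f_Y(y) = \begin{cases} f_X(y) + f_X(-y) & y > 0 \\ 0 & y < 0 \end{cases}$$


4.71. Let Y = sin X, where X is uniformly distributed over $(0, 2\pi)$. Find the pdf of Y.

Ans. $$f_Y(y) = \begin{cases} \dfrac{1}{\pi\sqrt{1 - y^2}} & -1 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$


4.72. Let X and Y be independent r.v.'s, each uniformly distributed over (0, 1). Let Z = X + Y, W = X - Y.
Find the marginal pdf's of Z and W.

Ans. $$f_Z(z) = \begin{cases} z & 0 < z < 1 \\ -z + 2 & 1 < z < 2 \\ 0 & \text{otherwise} \end{cases} \qquad f_W(w) = \begin{cases} w + 1 & -1 < w < 0 \\ -w + 1 & 0 < w < 1 \\ 0 & \text{otherwise} \end{cases}$$


4.73. Let X and Y be independent exponential r.v.'s with parameters $\alpha$ and $\beta$, respectively. Find the pdf of
(a) Z = X - Y; (b) Z = X/Y; (c) Z = max(X, Y); (d) Z = min(X, Y).

Ans. (a) $$f_Z(z) = \begin{cases} \dfrac{\alpha\beta}{\alpha + \beta}e^{-\alpha z} & z > 0 \\ \dfrac{\alpha\beta}{\alpha + \beta}e^{\beta z} & z < 0 \end{cases}$$

(b) $$f_Z(z) = \begin{cases} \dfrac{\alpha\beta}{(\alpha z + \beta)^2} & z > 0 \\ 0 & z < 0 \end{cases}$$

(c) $$f_Z(z) = \begin{cases} \alpha e^{-\alpha z}(1 - e^{-\beta z}) + \beta e^{-\beta z}(1 - e^{-\alpha z}) & z > 0 \\ 0 & z < 0 \end{cases}$$

(d) $$f_Z(z) = \begin{cases} (\alpha + \beta)e^{-(\alpha + \beta)z} & z > 0 \\ 0 & z < 0 \end{cases}$$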

4.74. Let X denote the number of heads obtained when three independent tossings of a fair coin are made. Let
$Y = X^2$. Find E(Y).

Ans. 3


4.75. Let X be a uniform r.v. over (-1, 1). Let $Y = X^n$.

(a) Calculate the covariance of X and Y.
(b) Calculate the correlation coefficient of X and Y.

Ans. (a) $$\mathrm{Cov}(X, Y) = \begin{cases} \dfrac{1}{n + 2} & n = \text{odd} \\ 0 & n = \text{even} \end{cases}$$

(b) $$\rho_{XY} = \begin{cases} \dfrac{\sqrt{3(2n + 1)}}{n + 2} & n = \text{odd} \\ 0 & n = \text{even} \end{cases}$$
4.76. Let the moment generating function of a discrete r.v. X be given by 
M(t) = 0.25e' + 0.35e7' + 0.40e7! 
Find P(X = 3). 
Ans. 0.35 


4.77. Let X be a geometric r.v. with parameter p. 


(a) Determine the moment generating function of X. 
(b) Find the mean of X for p = 3. 


f 


Ans. (a) Mi) = 7a t<~—Ingq=!—p (b) E(X) = 


Nie 


4.78. Let X be a uniform r.v. over (a, b).

(a) Determine the moment generating function of X.
(b) Using the result of (a), find E(X), $E(X^2)$, and $E(X^3)$.

Ans. (a) $$M_X(t) = \frac{e^{tb} - e^{ta}}{t(b - a)}$$

(b) $E(X) = \frac{1}{2}(b + a)$, $E(X^2) = \frac{1}{3}(b^2 + ab + a^2)$, $E(X^3) = \frac{1}{4}(b^3 + b^2a + ba^2 + a^3)$


4.79. Consider a r.v. X with pdf

$$f_X(x) = \frac{1}{\sqrt{32\pi}} e^{-(x + 7)^2/32} \qquad -\infty < x < \infty$$

Find the moment generating function of X.

Ans. $M_X(t) = e^{-7t + 8t^2}$


4.80. Let X = N(0; 1). Using the moment generating function of X, determine $E(X^n)$.

Ans. $$E(X^n) = \begin{cases} 0 & n = 1, 3, 5, \ldots \\ 1 \cdot 3 \cdots (n - 1) & n = 2, 4, 6, \ldots \end{cases}$$


4.81. Let X and Y be independent binomial r.v.'s with parameters (n, p) and (m, p), respectively. Let Z = X + Y.
What is the distribution of Z?

Hint: Use the moment generating functions.

Ans. Z is a binomial r.v. with parameters (n + m, p).


4.82. Let (X, Y) be a continuous bivariate r.v. with joint pdf

$$f_{XY}(x, y) = \begin{cases} e^{-(x + y)} & x > 0, y > 0 \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the joint moment generating function of X and Y.
(b) Find the joint moments $m_{10}$, $m_{01}$, and $m_{11}$.

Ans. (a) $$M_{XY}(t_1, t_2) = \frac{1}{(1 - t_1)(1 - t_2)}$$ (b) $m_{10} = 1$, $m_{01} = 1$, $m_{11} = 1$


4.83. Let (X, Y) be a bivariate normal r.v. defined by Eq. (3.88). Find the joint moment generating function of X
and Y.

Ans. $$M_{XY}(t_1, t_2) = e^{t_1\mu_X + t_2\mu_Y + (t_1^2\sigma_X^2 + 2t_1t_2\sigma_X\sigma_Y\rho + t_2^2\sigma_Y^2)/2}$$


4.84. Let $X_1, \ldots, X_n$ be n independent r.v.'s and $X_i > 0$. Let

$$Y = X_1 \cdot X_2 \cdots X_n = \prod_{i=1}^{n} X_i$$

Show that for large n, the pdf of Y is approximately log-normal.

Hint: Take the natural logarithm of Y and use the central limit theorem and the result of Prob. 4.10.


4.85. Let $Y = (X - \lambda)/\sqrt{\lambda}$, where X is a Poisson r.v. with parameter $\lambda$. Show that $Y \approx N(0; 1)$ when $\lambda$ is
sufficiently large.

Hint: Find the moment generating function of Y and let $\lambda \to \infty$.


4.86. Consider an experiment of tossing a fair coin 1000 times. Find the probability of obtaining more than 520
heads (a) by using formula (4.136), and (b) by formula (4.139).

Ans. (a) 0.1038 (b) 0.0974


4.87. The number of cars entering a parking lot is Poisson distributed with a rate of 100 cars per hour. Find the
time required for more than 200 cars to have entered the parking lot with probability 0.90 (a) by using
formula (4.140), and (b) by formula (4.143).

Ans. (a) 2.189 h (b) 2.1946 h


Chapter 5 


Random Processes 


5.1 INTRODUCTION 


In this chapter, we introduce the concept of a random (or stochastic) process. The theory of
random processes was first developed in connection with the study of fluctuations and noise in physical
systems. A random process is the mathematical model of an empirical process whose development
is governed by probability laws. Random processes provide useful models for the studies of such
diverse fields as statistical physics, communication and control, time series analysis, population
growth, and management sciences.


5.2 RANDOM PROCESSES 
A. Definition:


A random process is a family of r.v.'s $\{X(t), t \in T\}$ defined on a given probability space, indexed
by the parameter t, where t varies over an index set T.

Recall that a random variable is a function defined on the sample space S (Sec. 2.2). Thus, a
random process $\{X(t), t \in T\}$ is really a function of two arguments $\{X(t, \zeta), t \in T, \zeta \in S\}$. For a fixed
$t (= t_k)$, $X(t_k, \zeta) = X_k(\zeta)$ is a r.v. denoted by $X(t_k)$, as $\zeta$ varies over the sample space S. On the other
hand, for a fixed sample point $\zeta_i \in S$, $X(t, \zeta_i) = X_i(t)$ is a single function of time t, called a sample
function or a realization of the process. The totality of all sample functions is called an ensemble.

Of course if both $\zeta$ and t are fixed, $X(t_k, \zeta_i)$ is simply a real number. In the following we use the
notation X(t) to represent $X(t, \zeta)$.


B. Description of a Random Process: 


In a random process $\{X(t), t \in T\}$, the index set T is called the parameter set of the random
process. The values assumed by X(t) are called states, and the set of all possible values forms the state
space E of the random process. If the index set T of a random process is discrete, then the process is
called a discrete-parameter (or discrete-time) process. A discrete-parameter process is also called a
random sequence and is denoted by $\{X_n, n = 1, 2, \ldots\}$. If T is continuous, then we have a
continuous-parameter (or continuous-time) process. If the state space E of a random process is discrete, then the
process is called a discrete-state process, often referred to as a chain. In this case, the state space E is
often assumed to be {0, 1, 2, ...}. If the state space E is continuous, then we have a continuous-state
process.

A complex random process X(t) is defined by

$$X(t) = X_1(t) + jX_2(t)$$

where $X_1(t)$ and $X_2(t)$ are (real) random processes and $j = \sqrt{-1}$. Throughout this book, all random
processes are real random processes unless specified otherwise.


5.3 CHARACTERIZATION OF RANDOM PROCESSES 
A. Probabilistic Descriptions: 


Consider a random process X(t). For a fixed time $t_1$, $X(t_1) = X_1$ is a r.v., and its cdf $F_X(x_1; t_1)$ is
defined as

$$F_X(x_1; t_1) = P\{X(t_1) \le x_1\} \tag{5.1}$$

$F_X(x_1; t_1)$ is known as the first-order distribution of X(t). Similarly, given $t_1$ and $t_2$, $X(t_1) = X_1$ and
$X(t_2) = X_2$ represent two r.v.'s. Their joint distribution is known as the second-order distribution of
X(t) and is given by

$$F_X(x_1, x_2; t_1, t_2) = P\{X(t_1) \le x_1, X(t_2) \le x_2\} \tag{5.2}$$

In general, we define the nth-order distribution of X(t) by

$$F_X(x_1, \ldots, x_n; t_1, \ldots, t_n) = P\{X(t_1) \le x_1, \ldots, X(t_n) \le x_n\} \tag{5.3}$$

If X(t) is a discrete-state process, then X(t) is specified by a collection of pmf's:

$$p_X(x_1, \ldots, x_n; t_1, \ldots, t_n) = P\{X(t_1) = x_1, \ldots, X(t_n) = x_n\} \tag{5.4}$$

If X(t) is a continuous-state process, then X(t) is specified by a collection of pdf's:

$$f_X(x_1, \ldots, x_n; t_1, \ldots, t_n) = \frac{\partial^n F_X(x_1, \ldots, x_n; t_1, \ldots, t_n)}{\partial x_1 \cdots \partial x_n} \tag{5.5}$$

The complete characterization of X(t) requires knowledge of all the distributions as $n \to \infty$. Fortunately,
often much less is sufficient.

B. Mean, Correlation, and Covariance Functions:


As in the case of r.v.'s, random processes are often described by using statistical averages.
The mean of X(t) is defined by

$$\mu_X(t) = E[X(t)] \tag{5.6}$$

where X(t) is treated as a random variable for a fixed value of t. In general, $\mu_X(t)$ is a function of time,
and it is often called the ensemble average of X(t). A measure of dependence among the r.v.'s of X(t) is
provided by its autocorrelation function, defined by

$$R_X(t, s) = E[X(t)X(s)] \tag{5.7}$$

Note that

$$R_X(t, s) = R_X(s, t) \tag{5.8}$$

and

$$R_X(t, t) = E[X^2(t)] \tag{5.9}$$

The autocovariance function of X(t) is defined by

$$K_X(t, s) = \mathrm{Cov}[X(t), X(s)] = E\{[X(t) - \mu_X(t)][X(s) - \mu_X(s)]\} = R_X(t, s) - \mu_X(t)\mu_X(s) \tag{5.10}$$

It is clear that if the mean of X(t) is zero, then $K_X(t, s) = R_X(t, s)$. Note that the variance of X(t) is
given by

$$\sigma_X^2(t) = \mathrm{Var}[X(t)] = E\{[X(t) - \mu_X(t)]^2\} = K_X(t, t) \tag{5.11}$$

If X(t) is a complex random process, then its autocorrelation function $R_X(t, s)$ and autocovariance
function $K_X(t, s)$ are defined, respectively, by

$$R_X(t, s) = E[X(t)X^*(s)] \tag{5.12}$$

and

$$K_X(t, s) = E\{[X(t) - \mu_X(t)][X(s) - \mu_X(s)]^*\} \tag{5.13}$$

where * denotes the complex conjugate.
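The ensemble averages in Eqs. (5.6), (5.7), and (5.10) can be made concrete with a small finite ensemble. The process below, X(t) = A + Bt with four equally likely sample points, is a hypothetical example chosen only so that the averages can be computed by direct enumeration; it does not come from the text.

```python
from itertools import product

# Hypothetical process for illustration: X(t, zeta) = A + B*t, where the sample
# point zeta = (A, B) has A in {0, 2} and B in {-1, 1}, all four equally likely.
ENSEMBLE = [((a, b), 0.25) for a, b in product((0, 2), (-1, 1))]

def sample_function(zeta, t):
    """Value at time t of the sample function (realization) picked out by zeta."""
    a, b = zeta
    return a + b * t

def mean(t):
    """Ensemble average mu_X(t) = E[X(t)], Eq. (5.6)."""
    return sum(p * sample_function(z, t) for z, p in ENSEMBLE)

def autocorrelation(t, s):
    """R_X(t, s) = E[X(t) X(s)], Eq. (5.7)."""
    return sum(p * sample_function(z, t) * sample_function(z, s) for z, p in ENSEMBLE)

def autocovariance(t, s):
    """K_X(t, s) = R_X(t, s) - mu_X(t) mu_X(s), Eq. (5.10)."""
    return autocorrelation(t, s) - mean(t) * mean(s)

print(mean(3.0), autocorrelation(2.0, 3.0), autocovariance(2.0, 3.0))
```

For this example the enumeration gives $\mu_X(t) = 1$, $R_X(t, s) = 2 + ts$, and $K_X(t, s) = 1 + ts$, so the variance $K_X(t, t) = 1 + t^2$ grows with t.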


5.4 CLASSIFICATION OF RANDOM PROCESSES 


If a random process X(t) possesses some special probabilistic structure, we can specify less to
characterize X(t) completely. Some simple random processes are characterized completely by only the
first- and second-order distributions.


A. Stationary Processes:


A random process $\{X(t), t \in T\}$ is said to be stationary or strict-sense stationary if, for all n and
for every set of time instants $\{t_i \in T, i = 1, 2, \ldots, n\}$,

$$F_X(x_1, \ldots, x_n; t_1, \ldots, t_n) = F_X(x_1, \ldots, x_n; t_1 + \tau, \ldots, t_n + \tau) \tag{5.14}$$

for any $\tau$. Hence, the distribution of a stationary process will be unaffected by a shift in the time
origin, and X(t) and $X(t + \tau)$ will have the same distributions for any $\tau$. Thus, for the first-order
distribution,

$$F_X(x; t) = F_X(x; t + \tau) = F_X(x) \tag{5.15}$$

and

$$f_X(x; t) = f_X(x) \tag{5.16}$$

Then

$$\mu_X(t) = E[X(t)] = \mu \tag{5.17}$$
$$\mathrm{Var}[X(t)] = \sigma^2 \tag{5.18}$$

where $\mu$ and $\sigma^2$ are constants. Similarly, for the second-order distribution,

$$F_X(x_1, x_2; t_1, t_2) = F_X(x_1, x_2; t_2 - t_1) \tag{5.19}$$

and

$$f_X(x_1, x_2; t_1, t_2) = f_X(x_1, x_2; t_2 - t_1) \tag{5.20}$$

Nonstationary processes are characterized by distributions depending on the points $t_1, t_2, \ldots, t_n$.


B. Wide-Sense Stationary Processes: 


If stationary condition (5.14) of a random process X(t) does not hold for all n but holds for $n \le k$,
then we say that the process X(t) is stationary to order k. If X(t) is stationary to order 2, then X(t) is
said to be wide-sense stationary (WSS) or weakly stationary. If X(t) is a WSS random process, then we
have

1. $$E[X(t)] = \mu \text{ (constant)} \tag{5.21}$$
2. $$R_X(t, s) = E[X(t)X(s)] = R_X(|s - t|) \tag{5.22}$$

Note that a strict-sense stationary process is also a WSS process, but, in general, the converse is not
true.


C. Independent Processes: 


In a random process X(t), if $X(t_i)$ for i = 1, 2, ..., n are independent r.v.'s, so that for n = 2, 3, ...,

$$F_X(x_1, \ldots, x_n; t_1, \ldots, t_n) = \prod_{i=1}^{n} F_X(x_i; t_i) \tag{5.23}$$

then we call X(t) an independent random process. Thus, a first-order distribution is sufficient to
characterize an independent random process X(t).


D. Processes with Stationary Independent Increments: 


A random process $\{X(t), t \ge 0\}$ is said to have independent increments if whenever
$0 < t_1 < t_2 < \cdots < t_n$,

$$X(0),\ X(t_1) - X(0),\ X(t_2) - X(t_1),\ \ldots,\ X(t_n) - X(t_{n-1})$$

are independent. If $\{X(t), t \ge 0\}$ has independent increments and X(t) - X(s) has the same distribution
as X(t + h) - X(s + h) for all s, t, $h \ge 0$, s < t, then the process X(t) is said to have stationary
independent increments.

Let $\{X(t), t \ge 0\}$ be a random process with stationary independent increments and assume that
X(0) = 0. Then (Probs. 5.21 and 5.22)

$$E[X(t)] = \mu_1 t \tag{5.24}$$

where $\mu_1 = E[X(1)]$ and

$$\mathrm{Var}[X(t)] = \sigma_1^2 t \tag{5.25}$$

where $\sigma_1^2 = \mathrm{Var}[X(1)]$.

From Eq. (5.24), we see that processes with stationary independent increments are nonstationary.
Examples of processes with stationary independent increments are Poisson processes and Wiener
processes, which are discussed in later sections.


E. Markov Processes: 
A random process $\{X(t), t \in T\}$ is said to be a Markov process if

$$P\{X(t_{n+1}) \le x_{n+1} \mid X(t_1) = x_1, X(t_2) = x_2, \ldots, X(t_n) = x_n\} = P\{X(t_{n+1}) \le x_{n+1} \mid X(t_n) = x_n\} \tag{5.26}$$

whenever $t_1 < t_2 < \cdots < t_n < t_{n+1}$.

A discrete-state Markov process is called a Markov chain. For a discrete-parameter Markov
chain $\{X_n, n \ge 0\}$ (see Sec. 5.5), we have for every n

$$P(X_{n+1} = j \mid X_0 = i_0, X_1 = i_1, \ldots, X_n = i) = P(X_{n+1} = j \mid X_n = i) \tag{5.27}$$

Equation (5.26) or Eq. (5.27) is referred to as the Markov property (which is also known as the
memoryless property). This property of a Markov process states that the future state of the process
depends only on the present state and not on the past history. Clearly, any process with independent
increments is a Markov process.

Using the Markov property, the nth-order distribution of a Markov process X(t) can be
expressed as (Prob. 5.25)

$$F_X(x_1, \ldots, x_n; t_1, \ldots, t_n) = F_X(x_1; t_1)\prod_{k=2}^{n} P\{X(t_k) \le x_k \mid X(t_{k-1}) = x_{k-1}\} \tag{5.28}$$

Thus, all finite-order distributions of a Markov process can be expressed in terms of the second-order
distributions.


F. Normal Processes: 


A random process $\{X(t), t \in T\}$ is said to be a normal (or gaussian) process if for any integer n
and any subset $\{t_1, \ldots, t_n\}$ of T, the n r.v.'s $X(t_1), \ldots, X(t_n)$ are jointly normally distributed in the
sense that their joint characteristic function is given by

$$\begin{aligned}
\Psi_{X(t_1)\cdots X(t_n)}(\omega_1, \ldots, \omega_n) &= E\{\exp j[\omega_1 X(t_1) + \cdots + \omega_n X(t_n)]\} \\
&= \exp\left\{j\sum_{i=1}^{n} \omega_i E[X(t_i)] - \frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{n} \omega_i\omega_k\,\mathrm{Cov}[X(t_i), X(t_k)]\right\}
\end{aligned} \tag{5.29}$$

where $\omega_1, \ldots, \omega_n$ are any real numbers (see Probs. 5.59 and 5.60). Equation (5.29) shows that a
normal process is completely characterized by the second-order distributions. Thus, if a normal
process is wide-sense stationary, then it is also strictly stationary.


G. Ergodic Processes:


Consider a random process $\{X(t), -\infty < t < \infty\}$ with a typical sample function x(t). The time
average of x(t) is defined as

$$\langle x(t) \rangle = \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} x(t)\,dt \tag{5.30}$$

Similarly, the time autocorrelation function $\bar{R}_X(\tau)$ of x(t) is defined as

$$\bar{R}_X(\tau) = \langle x(t)x(t + \tau) \rangle = \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} x(t)x(t + \tau)\,dt \tag{5.31}$$

A random process is said to be ergodic if it has the property that the time averages of sample
functions of the process are equal to the corresponding statistical or ensemble averages. The subject
of ergodicity is extremely complicated. However, in most physical applications, it is assumed that
stationary processes are ergodic.


5.5 DISCRETE-PARAMETER MARKOV CHAINS


In this section we treat a discrete-parameter Markov chain $\{X_n, n \ge 0\}$ with a discrete state
space E = {0, 1, 2, ...}, where this set may be finite or infinite. If $X_n = i$, then the Markov chain is
said to be in state i at time n (or the nth step). A discrete-parameter Markov chain $\{X_n, n \ge 0\}$ is
characterized by [Eq. (5.27)]

$$P(X_{n+1} = j \mid X_0 = i_0, X_1 = i_1, \ldots, X_n = i) = P(X_{n+1} = j \mid X_n = i) \tag{5.32}$$

where $P\{X_{n+1} = j \mid X_n = i\}$ are known as one-step transition probabilities. If $P\{X_{n+1} = j \mid X_n = i\}$ is
independent of n, then the Markov chain is said to possess stationary transition probabilities and the
process is referred to as a homogeneous Markov chain. Otherwise the process is known as a
nonhomogeneous Markov chain. Note that the concepts of a Markov chain's having stationary transition
probabilities and being a stationary random process should not be confused. The Markov process, in
general, is not stationary. We shall consider only homogeneous Markov chains in this section.


A. Transition Probability Matrix: 


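Let $\{X_n, n \ge 0\}$ be a homogeneous Markov chain with a discrete infinite state space $E = \{0, 1, 2, \ldots\}$. Then

$$p_{ij} = P(X_{n+1} = j \mid X_n = i) \qquad i \ge 0,\ j \ge 0 \qquad (5.33)$$

regardless of the value of $n$. A transition probability matrix of $\{X_n, n \ge 0\}$ is defined by

$$P = [p_{ij}] = \begin{bmatrix} p_{00} & p_{01} & p_{02} & \cdots \\ p_{10} & p_{11} & p_{12} & \cdots \\ p_{20} & p_{21} & p_{22} & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix}$$

where the elements satisfy

$$p_{ij} \ge 0 \qquad \sum_{j=0}^{\infty} p_{ij} = 1 \qquad i = 0, 1, 2, \ldots \qquad (5.34)$$

In the case where the state space $E$ is finite and equal to $\{1, 2, \ldots, m\}$, $P$ is $m \times m$ dimensional; that is,

$$P = [p_{ij}] = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1m} \\ p_{21} & p_{22} & \cdots & p_{2m} \\ \vdots & \vdots & & \vdots \\ p_{m1} & p_{m2} & \cdots & p_{mm} \end{bmatrix}$$

where

$$p_{ij} \ge 0 \qquad \sum_{j=1}^{m} p_{ij} = 1 \qquad i = 1, 2, \ldots, m \qquad (5.35)$$

A square matrix whose elements satisfy Eq. (5.34) or (5.35) is called a Markov matrix or stochastic
matrix.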
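The conditions in Eq. (5.35) are easy to check mechanically. The sketch below tests whether a finite square matrix, stored as a list of rows, is a stochastic matrix; the function name `is_stochastic` and the tolerance `tol` are assumptions introduced for this example.

```python
def is_stochastic(P, tol=1e-12):
    """Return True if P is square, has nonnegative entries, and every row sums to 1."""
    n = len(P)
    return all(
        len(row) == n
        and all(p >= 0 for p in row)
        and abs(sum(row) - 1.0) <= tol
        for row in P
    )

good = [[0.9, 0.1],
        [0.5, 0.5]]
bad_row_sum = [[0.9, 0.2],
               [0.5, 0.5]]
bad_negative = [[1.1, -0.1],
                [0.5, 0.5]]
```

Here `good` satisfies Eq. (5.35), `bad_row_sum` violates the row-sum condition, and `bad_negative` has rows that sum to one but contains a negative entry.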


B. Higher-Order Transition Probabilities—Chapman-Kolmogorov Equation: 


Tractability of Markov chain models is based on the fact that the probability distribution of
$\{X_n, n \ge 0\}$ can be computed by matrix manipulations.

Let $P = [p_{ij}]$ be the transition probability matrix of a Markov chain $\{X_n, n \ge 0\}$. Matrix powers
of $P$ are defined by

$$P^2 = PP$$

with the $(i, j)$th element given by

$$p_{ij}^{(2)} = \sum_k p_{ik}\,p_{kj}$$

Note that when the state space $E$ is infinite, the series above converges, since by Eq. (5.34),

$$\sum_k p_{ik}\,p_{kj} \le \sum_k p_{ik} = 1$$

Similarly, $P^3 = PP^2$ has the $(i, j)$th element

$$p_{ij}^{(3)} = \sum_k p_{ik}\,p_{kj}^{(2)}$$

and in general, $P^{n+1} = PP^n$ has the $(i, j)$th element

$$p_{ij}^{(n+1)} = \sum_k p_{ik}\,p_{kj}^{(n)} \qquad (5.36)$$

Finally, we define $P^0 = I$, where $I$ is the identity matrix.
The $n$-step transition probabilities for the homogeneous Markov chain $\{X_n, n \ge 0\}$ are defined
by

$$P(X_n = j \mid X_0 = i)$$

Then we can show that (Prob. 5.70)

$$p_{ij}^{(n)} = P(X_n = j \mid X_0 = i) \qquad (5.37)$$

We compute $p_{ij}^{(n)}$ by taking matrix powers.
The matrix identity

$$P^{n+m} = P^n P^m \qquad n, m \ge 0$$

when written in terms of elements

$$p_{ij}^{(n+m)} = \sum_k p_{ik}^{(n)}\,p_{kj}^{(m)} \qquad (5.38)$$

is known as the Chapman-Kolmogorov equation. It expresses the fact that a transition from $i$ to $j$ in
$n + m$ steps can be achieved by moving from $i$ to an intermediate $k$ in $n$ steps (with probability $p_{ik}^{(n)}$),
and then proceeding to $j$ from $k$ in $m$ steps (with probability $p_{kj}^{(m)}$). Furthermore, the events "go from $i$
to $k$ in $n$ steps" and "go from $k$ to $j$ in $m$ steps" are independent. Hence the probability of the
transition from $i$ to $j$ in $n + m$ steps via $i$, $k$, $j$ is $p_{ik}^{(n)}\,p_{kj}^{(m)}$. Finally, the probability of the transition
from $i$ to $j$ is obtained by summing over the intermediate state $k$.
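The Chapman-Kolmogorov equation (5.38) can be checked numerically by comparing $P^{n+m}$ with $P^n P^m$. The sketch below does this with exact rational arithmetic for a small three-state chain; the matrix `P` and the helper names `mat_mul` and `mat_pow` are hypothetical choices for illustration.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_pow(P, n):
    """P**n by repeated multiplication, with P**0 = I."""
    size = len(P)
    result = [[F(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result

# Hypothetical three-state transition matrix (each row sums to 1).
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 3), F(2, 3)]]

lhs = mat_pow(P, 5)                            # P^(2+3)
rhs = mat_mul(mat_pow(P, 2), mat_pow(P, 3))    # P^2 P^3
two_step_00 = mat_pow(P, 2)[0][0]              # p_00^(2) = sum_k p_0k p_k0
```

Using `Fraction` avoids rounding, so the two sides can be compared with exact equality.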


C. The Probability Distribution of $\{X_n, n \ge 0\}$:

Let $p_i(n) = P(X_n = i)$ and

$$\mathbf{p}(n) = [p_0(n) \quad p_1(n) \quad p_2(n) \quad \cdots]$$

where

$$\sum_k p_k(n) = 1$$

Then $p_i(0) = P(X_0 = i)$ are the initial-state probabilities,

$$\mathbf{p}(0) = [p_0(0) \quad p_1(0) \quad p_2(0) \quad \cdots]$$

is called the initial-state probability vector, and $\mathbf{p}(n)$ is called the state probability vector after $n$
transitions or the probability distribution of $X_n$. Now it can be shown that (Prob. 5.29)

$$\mathbf{p}(n) = \mathbf{p}(0)P^n \qquad (5.39)$$

which indicates that the probability distribution of a homogeneous Markov chain is completely
determined by the one-step transition probability matrix $P$ and the initial-state probability vector
$\mathbf{p}(0)$.
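Equation (5.39) amounts to multiplying the row vector $\mathbf{p}(0)$ by $P$ once per transition. A minimal sketch, assuming a hypothetical two-state chain and the helper names `step` and `distribution_after`:

```python
from fractions import Fraction as F

def step(p, P):
    """One transition: the row vector p times the matrix P."""
    return [sum(p[i] * P[i][j] for i in range(len(p))) for j in range(len(P[0]))]

def distribution_after(p0, P, n):
    """State probability vector p(n) = p(0) P^n, computed one step at a time."""
    p = list(p0)
    for _ in range(n):
        p = step(p, P)
    return p

# Hypothetical chain that starts in state 0 with probability 1.
P = [[F(9, 10), F(1, 10)],
     [F(1, 2), F(1, 2)]]
p0 = [F(1), F(0)]
p2 = distribution_after(p0, P, 2)   # [43/50, 7/50]
```

Applying `step` repeatedly avoids forming $P^n$ explicitly, which is usually cheaper when only one initial vector is of interest.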


D. Classification of States: 


1. Accessible States:

State $j$ is said to be accessible from state $i$ if for some $n \ge 0$, $p_{ij}^{(n)} > 0$, and we write $i \to j$. Two
states $i$ and $j$ accessible to each other are said to communicate, and we write $i \leftrightarrow j$. If all states
communicate with each other, then we say that the Markov chain is irreducible.

2. Recurrent States:

Let $T_j$ be the time (or the number of steps) of the first visit to state $j$ after time zero, unless state $j$
is never visited, in which case we set $T_j = \infty$. Then $T_j$ is a discrete r.v. taking values in $\{1, 2, \ldots, \infty\}$.
Let

$$f_{ij}^{(m)} = P(T_j = m \mid X_0 = i) = P(X_m = j,\ X_k \ne j,\ k = 1, 2, \ldots, m-1 \mid X_0 = i) \qquad (5.40)$$

and $f_{ij}^{(0)} = 0$ since $T_j \ge 1$. Then

$$f_{ij}^{(1)} = P(T_j = 1 \mid X_0 = i) = P(X_1 = j \mid X_0 = i) = p_{ij} \qquad (5.41)$$

and

$$f_{ij}^{(m)} = \sum_{k \ne j} p_{ik}\,f_{kj}^{(m-1)} \qquad m = 2, 3, \ldots \qquad (5.42)$$

The probability of visiting $j$ in finite time, starting from $i$, is given by

$$f_{ij} = \sum_{n=0}^{\infty} f_{ij}^{(n)} = P(T_j < \infty \mid X_0 = i) \qquad (5.43)$$

Now state $j$ is said to be recurrent if

$$f_{jj} = P(T_j < \infty \mid X_0 = j) = 1 \qquad (5.44)$$

That is, starting from $j$, the probability of eventual return to $j$ is one. A recurrent state $j$ is said to be
positive recurrent if

$$E(T_j \mid X_0 = j) < \infty \qquad (5.45)$$

and state $j$ is said to be null recurrent if

$$E(T_j \mid X_0 = j) = \infty \qquad (5.46)$$

Note that

$$E(T_j \mid X_0 = j) = \sum_{n=0}^{\infty} n f_{jj}^{(n)} \qquad (5.47)$$

3. Transient States:

State $j$ is said to be transient (or nonrecurrent) if

$$f_{jj} = P(T_j < \infty \mid X_0 = j) < 1 \qquad (5.48)$$

In this case there is positive probability of never returning to state $j$.
4. Periodic and Aperiodic States:

We define the period of state $j$ to be

$$d(j) = \gcd\{n \ge 1 : p_{jj}^{(n)} > 0\}$$

where gcd stands for greatest common divisor.
If $d(j) > 1$, then state $j$ is called periodic with period $d(j)$. If $d(j) = 1$, then state $j$ is called aperiodic.
Note that whenever $p_{jj} > 0$, $j$ is aperiodic.

5. Absorbing States:

State $j$ is said to be an absorbing state if $p_{jj} = 1$; that is, once state $j$ is reached, it is never left.
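The first-passage probabilities of Eqs. (5.41) and (5.42) and the period $d(j)$ can both be computed directly for a small finite chain. The sketch below is one way to do so; the function names, the two example matrices, and the finite `horizon` used to truncate the gcd are assumptions made for illustration (the true period is defined over all $n \ge 1$).

```python
from fractions import Fraction as F
from math import gcd

def first_passage(P, i, j, max_m):
    """Return [f_ij^(1), ..., f_ij^(max_m)] using Eqs. (5.41) and (5.42)."""
    states = range(len(P))
    f_prev = [P[k][j] for k in states]            # f_kj^(1) = p_kj for every start k
    out = [f_prev[i]]
    for _ in range(2, max_m + 1):
        # f_kj^(m) = sum over l != j of p_kl * f_lj^(m-1)
        f_next = [sum(P[k][l] * f_prev[l] for l in states if l != j) for k in states]
        out.append(f_next[i])
        f_prev = f_next
    return out

def period(P, j, horizon=50):
    """gcd of all n in 1..horizon with p_jj^(n) > 0 (a truncated version of d(j))."""
    size = len(P)
    Pn = [row[:] for row in P]                    # holds P^n, starting at n = 1
    d = 0
    for n in range(1, horizon + 1):
        if Pn[j][j] > 0:
            d = gcd(d, n)
        Pn = [[sum(Pn[a][k] * P[k][b] for k in range(size)) for b in range(size)]
              for a in range(size)]
    return d

flip = [[F(0), F(1)], [F(1), F(0)]]               # deterministic alternation between two states
lazy = [[F(1, 2), F(1, 2)], [F(1), F(0)]]         # state 0 may stay put

f_flip = first_passage(flip, 0, 0, 3)             # [0, 1, 0]: return happens exactly at step 2
f_lazy = first_passage(lazy, 0, 0, 5)
mean_return_lazy = sum(n * f for n, f in enumerate(f_lazy, start=1))   # Eq. (5.47), truncated
```

For `flip`, state 0 is recurrent with period 2. For `lazy`, the first-return probabilities sum to one within the first two steps, so state 0 is recurrent and aperiodic with mean recurrence time 3/2.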


E. Absorption Probabilities: 


Consider a Markov chain $X(n) = \{X_n, n \ge 0\}$ with finite state space $E = \{1, 2, \ldots, N\}$ and transition
probability matrix $P$. Let $A = \{1, \ldots, m\}$ be the set of absorbing states and $B = \{m+1, \ldots, N\}$
be a set of nonabsorbing states. Then the transition probability matrix $P$ can be expressed as

$$P = \begin{bmatrix}
1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & 0 & \cdots & 0 \\
p_{m+1,1} & \cdots & p_{m+1,m} & p_{m+1,m+1} & \cdots & p_{m+1,N} \\
\vdots & & \vdots & \vdots & & \vdots \\
p_{N,1} & \cdots & p_{N,m} & p_{N,m+1} & \cdots & p_{N,N}
\end{bmatrix} = \begin{bmatrix} I & O \\ R & Q \end{bmatrix} \qquad (5.49a)$$

where $I$ is an $m \times m$ identity matrix, $O$ is an $m \times (N - m)$ zero matrix, and

$$R = \begin{bmatrix} p_{m+1,1} & \cdots & p_{m+1,m} \\ \vdots & & \vdots \\ p_{N,1} & \cdots & p_{N,m} \end{bmatrix} \qquad Q = \begin{bmatrix} p_{m+1,m+1} & \cdots & p_{m+1,N} \\ \vdots & & \vdots \\ p_{N,m+1} & \cdots & p_{N,N} \end{bmatrix} \qquad (5.49b)$$

Note that the elements of $R$ are the one-step transition probabilities from nonabsorbing to absorbing
states, and the elements of $Q$ are the one-step transition probabilities among the nonabsorbing states.
Let $U = [u_{kj}]$, where

$$u_{kj} = P\{X_n = j\,(\in A) \mid X_0 = k\,(\in B)\}$$

It is seen that $U$ is an $(N - m) \times m$ matrix and its elements are the absorption probabilities for the
various absorbing states. Then it can be shown that (Prob. 5.40)

$$U = (I - Q)^{-1}R = \Phi R \qquad (5.50)$$

The matrix $\Phi = (I - Q)^{-1}$ is known as the fundamental matrix of the Markov chain $X(n)$. Let $T_k$
denote the total time units (or steps) to absorption from state $k$. Let

$$T = [T_{m+1} \quad T_{m+2} \quad \cdots \quad T_N]$$

Then it can be shown that (Prob. 5.74)

$$E(T_k) = \sum_{i=m+1}^{N} \phi_{ki} \qquad k = m+1, \ldots, N \qquad (5.51)$$

where $\phi_{ki}$ is the $(k, i)$th element of the fundamental matrix $\Phi$.
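Equations (5.50) and (5.51) can be evaluated exactly for a small chain. The sketch below uses a hypothetical fair gambler's-ruin chain with two absorbing states (fortune 0 and fortune 3) and two nonabsorbing states (fortunes 1 and 2); the matrices `Q` and `R` and the helper `inverse` are assumptions for this example, not taken from the text.

```python
from fractions import Fraction as F

def inverse(M):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(M)
    aug = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = 1 / aug[col][col]
        aug[col] = [v * scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Nonabsorbing states: fortunes 1 and 2. Absorbing states: fortunes 0 and 3.
Q = [[F(0), F(1, 2)],
     [F(1, 2), F(0)]]          # transitions among nonabsorbing states
R = [[F(1, 2), F(0)],
     [F(0), F(1, 2)]]          # transitions from nonabsorbing to absorbing states

I_minus_Q = [[F(int(i == j)) - Q[i][j] for j in range(2)] for i in range(2)]
Phi = inverse(I_minus_Q)                                   # fundamental matrix
U = [[sum(Phi[k][i] * R[i][j] for i in range(2)) for j in range(2)]
     for k in range(2)]                                    # absorption probabilities, Eq. (5.50)
expected_steps = [sum(row) for row in Phi]                 # E(T_k), Eq. (5.51)
```

Starting from fortune 1, the chain is absorbed at 0 with probability 2/3 and at 3 with probability 1/3, and the expected time to absorption is 2 steps from either nonabsorbing state.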


F. Stationary Distributions: 


Let $P$ be the transition probability matrix of a homogeneous Markov chain $\{X_n, n \ge 0\}$. If there
exists a probability vector $\hat{\mathbf{p}}$ such that

$$\hat{\mathbf{p}}P = \hat{\mathbf{p}} \qquad (5.52)$$

then $\hat{\mathbf{p}}$ is called a stationary distribution for the Markov chain. Equation (5.52) indicates that a
stationary distribution $\hat{\mathbf{p}}$ is a (left) eigenvector of $P$ with eigenvalue 1. Note that any nonzero multiple of $\hat{\mathbf{p}}$
is also an eigenvector of $P$. But the stationary distribution $\hat{\mathbf{p}}$ is fixed by being a probability vector;
that is, its components sum to unity.


G. Limiting Distributions:


A Markov chain is called regular if there is a finite positive integer $m$ such that after $m$ time-steps,
every state has a nonzero chance of being occupied, no matter what the initial state. Let $A > O$
denote that every element $a_{ij}$ of $A$ satisfies the condition $a_{ij} > 0$. Then, for a regular Markov chain
with transition probability matrix $P$, there exists an $m > 0$ such that $P^m > O$. For a regular
homogeneous Markov chain we have the following theorem:


THEOREM 5.5.1


Let $\{X_n, n \ge 0\}$ be a regular homogeneous finite-state Markov chain with transition matrix $P$.
Then

$$\lim_{n \to \infty} P^n = \hat{P} \qquad (5.53)$$

where $\hat{P}$ is a matrix whose rows are identical and equal to the stationary distribution $\hat{\mathbf{p}}$ for the
Markov chain defined by Eq. (5.52).
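Theorem 5.5.1 can be observed numerically by raising a regular transition matrix to a high power. In the sketch below the two-state matrix `P`, the exponent 60, and the helper `mat_mul` are hypothetical choices; for this matrix, solving Eq. (5.52) by hand gives the stationary distribution [5/6, 1/6].

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

P = [[0.9, 0.1],
     [0.5, 0.5]]                 # regular: every entry of P is already positive

Pn = [[1.0, 0.0], [0.0, 1.0]]    # P^0 = I
for _ in range(60):
    Pn = mat_mul(Pn, P)          # after the loop, Pn approximates the limit of P^n

p_hat = Pn[0]                    # either row approximates the stationary distribution
# Residual of Eq. (5.52): p_hat P - p_hat should be numerically zero.
residual = [sum(p_hat[i] * P[i][j] for i in range(2)) - p_hat[j] for j in range(2)]
```

Both rows of `Pn` agree with [5/6, 1/6] to within rounding, regardless of the starting state.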


5.6 POISSON PROCESSES 
A. Definitions: 


Let $t$ represent a time variable. Suppose an experiment begins at $t = 0$. Events of a particular
kind occur randomly, the first at $T_1$, the second at $T_2$, and so on. The r.v. $T_i$ denotes the time at which
the $i$th event occurs, and the values $t_i$ of $T_i$ $(i = 1, 2, \ldots)$ are called points of occurrence (Fig. 5-1).
Let

$$Z_n = T_n - T_{n-1} \qquad (5.54)$$

and $T_0 = 0$. Then $Z_n$ denotes the time between the $(n-1)$st and the $n$th events (Fig. 5-1). The
sequence of ordered r.v.'s $\{Z_n, n \ge 1\}$ is sometimes called an interarrival process. If all r.v.'s $Z_n$ are
independent and identically distributed, then $\{Z_n, n \ge 1\}$ is called a renewal process or a recurrent
process. From Eq. (5.54), we see that

$$T_n = Z_1 + Z_2 + \cdots + Z_n$$

where $T_n$ denotes the time from the beginning until the occurrence of the $n$th event. Thus, $\{T_n, n \ge 0\}$
is sometimes called an arrival process.


B. Counting Processes: 


A random process $\{X(t), t \ge 0\}$ is said to be a counting process if $X(t)$ represents the total number
of "events" that have occurred in the interval $(0, t)$. From its definition, we see that for a counting
process, $X(t)$ must satisfy the following conditions:

1. $X(t) \ge 0$ and $X(0) = 0$.
2. $X(t)$ is integer valued.
3. $X(s) \le X(t)$ if $s < t$.
4. $X(t) - X(s)$ equals the number of events that have occurred on the interval $(s, t)$.

A typical sample function (or realization) of $X(t)$ is shown in Fig. 5-2.

A counting process $X(t)$ is said to possess independent increments if the numbers of events which
occur in disjoint time intervals are independent. A counting process $X(t)$ is said to possess stationary
increments if the number of events in the interval $(s + h, t + h)$, that is, $X(t + h) - X(s + h)$, has the
same distribution as the number of events in the interval $(s, t)$, that is, $X(t) - X(s)$, for all $s < t$ and
$h > 0$.

Fig. 5-2. A sample function of a counting process.
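The relation between interarrival times, arrival times, and the counting process can be made concrete with a few lines of code. The sketch below builds the arrival times $t_n = z_1 + \cdots + z_n$ from a hypothetical list of interarrival values and evaluates the count as a step function; the function name `count` and the convention that an event occurring exactly at time $t$ is included are assumptions of this example.

```python
from bisect import bisect_right
from itertools import accumulate

interarrivals = [2, 5, 1, 4]                  # hypothetical values z_1, ..., z_4
arrivals = list(accumulate(interarrivals))    # t_n = z_1 + ... + z_n, as below Eq. (5.54)

def count(t):
    """x(t): number of events that have occurred up to and including time t."""
    return bisect_right(arrivals, t)

# Eq. (5.54) recovers the interarrival times from the arrival times (with t_0 = 0).
recovered = [b - a for a, b in zip([0] + arrivals, arrivals)]

samples = [count(t) for t in range(0, 14)]    # nondecreasing, integer valued, starts at 0
```

The list `samples` exhibits the four counting-process conditions: it starts at zero, takes integer values, never decreases, and its increments count the events in each interval.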


C. Poisson Processes: 


One of the most important types of counting processes is the Poisson process (or Poisson counting
process), which is defined as follows:

DEFINITION 5.6.1
A counting process $X(t)$ is said to be a Poisson process with rate (or intensity) $\lambda\ (> 0)$ if
1. $X(0) = 0$.
2. $X(t)$ has independent increments.
3. The number of events in any interval of length $t$ is Poisson distributed with mean $\lambda t$; that is, for
all $s, t > 0$,

$$P[X(t + s) - X(s) = n] = e^{-\lambda t}\,\frac{(\lambda t)^n}{n!} \qquad n = 0, 1, 2, \ldots \qquad (5.55)$$

It follows from condition 3 of Def. 5.6.1 that a Poisson process has stationary increments and that

$$E[X(t)] = \lambda t \qquad (5.56)$$

Then by Eq. (2.43) (Sec. 2.7C), we have

$$\operatorname{Var}[X(t)] = \lambda t \qquad (5.57)$$

Thus, the expected number of events in the unit interval $(0, 1)$, or any other interval of unit length, is
just $\lambda$ (hence the name of the rate or intensity).
An alternative definition of a Poisson process is given as follows:


DEFINITION 5.6.2
A counting process $X(t)$ is said to be a Poisson process with rate (or intensity) $\lambda\ (> 0)$ if
1. $X(0) = 0$.
2. $X(t)$ has independent and stationary increments.
3. $P[X(t + \Delta t) - X(t) = 1] = \lambda\,\Delta t + o(\Delta t)$
4. $P[X(t + \Delta t) - X(t) \ge 2] = o(\Delta t)$
where $o(\Delta t)$ is a function of $\Delta t$ which goes to zero faster than does $\Delta t$; that is,

$$\lim_{\Delta t \to 0} \frac{o(\Delta t)}{\Delta t} = 0 \qquad (5.58)$$

Note: Since addition or multiplication by a scalar does not change the property of approaching zero,
even when divided by $\Delta t$, $o(\Delta t)$ satisfies useful identities such as $o(\Delta t) + o(\Delta t) = o(\Delta t)$ and
$a\,o(\Delta t) = o(\Delta t)$ for all constant $a$.


It can be shown that Def. 5.6.1 and Def. 5.6.2 are equivalent (Prob. 5.49). Note that from
conditions 3 and 4 of Def. 5.6.2, we have (Prob. 5.50)

$$P[X(t + \Delta t) - X(t) = 0] = 1 - \lambda\,\Delta t + o(\Delta t) \qquad (5.59)$$

Equation (5.59) states that the probability that no event occurs in any short interval approaches unity
as the duration of the interval approaches zero. It can be shown that in the Poisson process, the
intervals between successive events are independent and identically distributed exponential r.v.'s
(Prob. 5.53). Thus, we also identify the Poisson process as a renewal process with exponentially
distributed intervals.
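The renewal description suggests a simple way to simulate a Poisson process: add up independent exponential interarrival times until the observation time is exceeded. The sketch below uses this to estimate the mean and variance of $X(t)$, which should both be close to $\lambda t$ by Eqs. (5.56) and (5.57). The function name, the seed, the rate, and the number of runs are assumptions chosen for illustration.

```python
import random

def poisson_count(lam, t, rng):
    """Number of events in (0, t] when interarrival times are iid exponential(lam)."""
    count, clock = 0, rng.expovariate(lam)
    while clock <= t:
        count += 1
        clock += rng.expovariate(lam)
    return count

rng = random.Random(0)                 # fixed seed so the run is repeatable
lam, t, runs = 2.0, 3.0, 20_000
counts = [poisson_count(lam, t, rng) for _ in range(runs)]

mean = sum(counts) / runs                                   # estimate of E[X(t)]
var = sum((c - mean) ** 2 for c in counts) / (runs - 1)     # estimate of Var[X(t)]
```

With these settings both estimates should land near $\lambda t = 6$; the agreement is statistical, so small deviations are expected.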

The autocorrelation function $R_X(t, s)$ and the autocovariance function $K_X(t, s)$ of a Poisson
process $X(t)$ with rate $\lambda$ are given by (Prob. 5.52)

$$R_X(t, s) = \lambda \min(t, s) + \lambda^2 ts \qquad (5.60)$$

$$K_X(t, s) = \lambda \min(t, s) \qquad (5.61)$$


5.7 WIENER PROCESSES

Another example of random processes with independent stationary increments is a Wiener process.


DEFINITION 5.7.1

A random process $\{X(t), t \ge 0\}$ is called a Wiener process if

1. $X(t)$ has stationary independent increments.
2. The increment $X(t) - X(s)$ $(t > s)$ is normally distributed.
3. $E[X(t)] = 0$.
4. $X(0) = 0$.

The Wiener process is also known as the Brownian motion process, since it originates as a model for
Brownian motion, the motion of particles suspended in a fluid. From Def. 5.7.1, we can verify that a
Wiener process is a normal process (Prob. 5.61) and

$$E[X(t)] = 0 \qquad (5.62)$$

$$\operatorname{Var}[X(t)] = \sigma^2 t \qquad (5.63)$$

where $\sigma^2$ is a parameter of the Wiener process which must be determined from observations. When
$\sigma^2 = 1$, $X(t)$ is called a standard Wiener (or standard Brownian motion) process.

The autocorrelation function $R_X(t, s)$ and the autocovariance function $K_X(t, s)$ of a Wiener
process $X(t)$ are given by (see Prob. 5.23)

$$R_X(t, s) = K_X(t, s) = \sigma^2 \min(t, s) \qquad s, t \ge 0 \qquad (5.64)$$

DEFINITION 5.7.2
A random process $\{X(t), t \ge 0\}$ is called a Wiener process with drift coefficient $\mu$ if

1. $X(t)$ has stationary independent increments.
2. $X(t)$ is normally distributed with mean $\mu t$.
3. $X(0) = 0$.

From condition 2, the pdf of a standard Wiener process with drift coefficient $\mu$ is given by

$$f_{X(t)}(x) = \frac{1}{\sqrt{2\pi t}}\,e^{-(x - \mu t)^2/(2t)} \qquad (5.65)$$
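A driftless Wiener process can be sampled on a time grid by adding independent zero-mean normal increments with variance $\sigma^2\,\Delta t$, which is what Def. 5.7.1 together with Eq. (5.63) implies for an increment over a step of length $\Delta t$. The sketch below estimates $\operatorname{Var}[X(2)]$ and $\operatorname{Cov}[X(1), X(2)]$ from simulated paths and compares them with Eqs. (5.63) and (5.64). The function name, seed, grid, and parameter values are assumptions for this example.

```python
import math
import random

def wiener_path(sigma, dt, n_steps, rng):
    """Sample X(0), X(dt), ..., X(n_steps * dt) with X(0) = 0 and N(0, sigma^2 dt) increments."""
    x, path = 0.0, [0.0]
    for _ in range(n_steps):
        x += rng.gauss(0.0, sigma * math.sqrt(dt))
        path.append(x)
    return path

rng = random.Random(1)                       # fixed seed so the run is repeatable
sigma, dt, runs = 1.5, 0.5, 20_000
paths = [wiener_path(sigma, dt, 4, rng) for _ in range(runs)]   # grid times 0, 0.5, 1, 1.5, 2

x1 = [p[2] for p in paths]                   # samples of X(1)
x2 = [p[4] for p in paths]                   # samples of X(2)
var_x2 = sum(v * v for v in x2) / runs                 # E[X(2)^2]; the mean is zero
cov_12 = sum(a * b for a, b in zip(x1, x2)) / runs     # E[X(1) X(2)]
```

The estimates should be close to $\sigma^2 \cdot 2 = 4.5$ and $\sigma^2 \min(1, 2) = 2.25$, respectively, up to sampling error.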


Solved Problems 


RANDOM PROCESSES 


5.1. Let $X_1, X_2, \ldots$ be independent Bernoulli r.v.'s (Sec. 2.7A) with $P(X_n = 1) = p$ and $P(X_n = 0) =
q = 1 - p$ for all $n$. The collection of r.v.'s $\{X_n, n \ge 1\}$ is a random process, and it is called a
Bernoulli process.

(a) Describe the Bernoulli process.
(b) Construct a typical sample sequence of the Bernoulli process.

(a) The Bernoulli process $\{X_n, n \ge 1\}$ is a discrete-parameter, discrete-state process. The state space is
$E = \{0, 1\}$, and the index set is $T = \{1, 2, \ldots\}$.

(b) A sample sequence of the Bernoulli process can be obtained by tossing a coin consecutively. If a head
appears, we assign 1, and if a tail appears, we assign 0. Thus, for instance,

n              1   2   3   4   5   6   7   8   9   10
Coin tossing   H   T   T   H   H   H   T   H   H   T
x_n            1   0   0   1   1   1   0   1   1   0

The sample sequence $\{x_n\}$ obtained above is plotted in Fig. 5-3.

Fig. 5-3 A sample function of a Bernoulli process.


5.2. Let $Z_1, Z_2, \ldots$ be independent identically distributed r.v.'s with $P(Z_n = 1) = p$ and
$P(Z_n = -1) = q = 1 - p$ for all $n$. Let

$$X_n = \sum_{i=1}^{n} Z_i \qquad n = 1, 2, \ldots \qquad (5.66)$$

and $X_0 = 0$. The collection of r.v.'s $\{X_n, n \ge 0\}$ is a random process, and it is called the simple
random walk $X(n)$ in one dimension.

(a) Describe the simple random walk $X(n)$.
(b) Construct a typical sample sequence (or realization) of $X(n)$.

(a) The simple random walk $X(n)$ is a discrete-parameter (or time), discrete-state random process. The
state space is $E = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$, and the index parameter set is $T = \{0, 1, 2, \ldots\}$.

(b) A sample sequence $x(n)$ of a simple random walk $X(n)$ can be produced by tossing a coin every second
and letting $x(n)$ increase by unity if a head appears and decrease by unity if a tail appears. Thus, for
instance,

n              0   1   2   3   4   5   6   7   8   9   10
Coin tossing       H   T   T   H   H   H   T   H   H   T
x(n)           0   1   0  -1   0   1   2   1   2   3   2

The sample sequence $x(n)$ obtained above is plotted in Fig. 5-4. The simple random walk $X(n)$ specified
in this problem is said to be unrestricted because there are no bounds on the possible values of $X_n$.

The simple random walk process is often used in the following primitive gambling model:
Toss a coin. If a head appears, you win one dollar; if a tail appears, you lose one dollar (see
Prob. 5.38).

Fig. 5-4 A sample function of a random walk.


5.3. Let $\{X_n, n \ge 0\}$ be a simple random walk of Prob. 5.2. Now let the random process $X(t)$ be
defined by

$$X(t) = X_n \qquad n \le t < n + 1$$

(a) Describe $X(t)$.
(b) Construct a typical sample function of $X(t)$.

(a) The random process $X(t)$ is a continuous-parameter (or time), discrete-state random process. The state
space is $E = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$, and the index parameter set is $T = \{t, t \ge 0\}$.

(b) A sample function $x(t)$ of $X(t)$ corresponding to Fig. 5-4 is shown in Fig. 5-5.

Fig. 5-5


5.4. Consider a random process $X(t)$ defined by

$$X(t) = Y \cos \omega t \qquad t \ge 0$$

where $\omega$ is a constant and $Y$ is a uniform r.v. over $(0, 1)$.

(a) Describe $X(t)$.
(b) Sketch a few typical sample functions of $X(t)$.

(a) The random process $X(t)$ is a continuous-parameter (or time), continuous-state random process. The
state space is $E = \{x: -1 < x < 1\}$ and the index parameter set is $T = \{t: t \ge 0\}$.

(b) Three sample functions of $X(t)$ are sketched in Fig. 5-6.


5.5. Consider patients coming to a doctor's office at random points in time. Let $X_n$ denote the time
(in hours) that the $n$th patient has to wait in the office before being admitted to see the doctor.

(a) Describe the random process $X(n) = \{X_n, n \ge 1\}$.
(b) Construct a typical sample function of $X(n)$.

(a) The random process $X(n)$ is a discrete-parameter, continuous-state random process. The state space is
$E = \{x: x \ge 0\}$, and the index parameter set is $T = \{1, 2, \ldots\}$.

(b) A sample function $x(n)$ of $X(n)$ is shown in Fig. 5-7.
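The sample sequences of Probs. 5.1 to 5.3 can be reproduced from the same ten coin tosses, H T T H H H T H H T. The sketch below is only an illustration; the variable names and the function `x_of_t` are assumptions, and the walk is defined only up to $n = 10$ because only ten tosses are listed.

```python
import math
from itertools import accumulate

tosses = "HTTHHHTHHT"

# Prob. 5.1: Bernoulli sample sequence (1 for a head, 0 for a tail).
bernoulli = [1 if c == "H" else 0 for c in tosses]

# Prob. 5.2: simple random walk, x(0) = 0 and x(n) = z_1 + ... + z_n with z_i = +1 or -1.
steps = [1 if c == "H" else -1 for c in tosses]
walk = [0] + list(accumulate(steps))          # x(0), x(1), ..., x(10)

def x_of_t(t):
    """Prob. 5.3: continuous-parameter version, x(t) = x(n) for n <= t < n + 1."""
    return walk[math.floor(t)]
```

Evaluating `x_of_t` at noninteger times gives the staircase function described in Prob. 5.3; for example, `x_of_t(2.7)` equals `walk[2]`.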


CHARACTERIZATION OF RANDOM PROCESSES 


5.6. Consider the Bernoulli process of Prob. 5.1. Determine the probability of occurrence of the
sample sequence obtained in part (b) of Prob. 5.1.

Fig. 5-6

Since the $X_n$'s are independent, we have

$$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = P(X_1 = x_1)P(X_2 = x_2) \cdots P(X_n = x_n) \qquad (5.67)$$

Thus, for the sample sequence of Fig. 5-3,

$$P(X_1 = 1, X_2 = 0, X_3 = 0, X_4 = 1, X_5 = 1, X_6 = 1, X_7 = 0, X_8 = 1, X_9 = 1, X_{10} = 0) = p^6 q^4$$

Fig. 5-7


5.7. Consider the random process $X(t)$ of Prob. 5.4. Determine the pdfs of $X(t)$ at $t = 0$, $\pi/4\omega$, $\pi/2\omega$,
$\pi/\omega$.

For $t = 0$, $X(0) = Y \cos 0 = Y$. Thus,

$$f_{X(0)}(x) = \begin{cases} 1 & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$

For $t = \pi/4\omega$, $X(\pi/4\omega) = Y \cos \pi/4 = (1/\sqrt{2})\,Y$. Thus,

$$f_{X(\pi/4\omega)}(x) = \begin{cases} \sqrt{2} & 0 < x < 1/\sqrt{2} \\ 0 & \text{otherwise} \end{cases}$$

For $t = \pi/2\omega$, $X(\pi/2\omega) = Y \cos \pi/2 = 0$; that is, $X(\pi/2\omega) = 0$ irrespective of the value of $Y$. Thus, the
pmf of $X(\pi/2\omega)$ is

$$p_{X(\pi/2\omega)}(x) = P(X = 0) = 1$$

For $t = \pi/\omega$, $X(\pi/\omega) = Y \cos \pi = -Y$. Thus,

$$f_{X(\pi/\omega)}(x) = \begin{cases} 1 & -1 < x < 0 \\ 0 & \text{otherwise} \end{cases}$$


5.8. Derive the first-order probability distribution of the simple random walk $X(n)$ of Prob. 5.2.

The first-order probability distribution of the simple random walk $X(n)$ is given by

$$p_n(k) = P(X_n = k)$$

where $k$ is an integer. Note that $P(X_0 = 0) = 1$. We note that $p_n(k) = 0$ if $n < |k|$ because the simple random
walk cannot get to level $k$ in less than $|k|$ steps. Thus, $n \ge |k|$.

Let $N_n^+$ and $N_n^-$ be the r.v.'s denoting the numbers of $+1$s and $-1$s, respectively, in the first $n$ steps.
Then

$$n = N_n^+ + N_n^- \qquad (5.68)$$

$$X_n = N_n^+ - N_n^- \qquad (5.69)$$

Adding Eqs. (5.68) and (5.69), we get

$$N_n^+ = \tfrac{1}{2}(n + X_n) \qquad (5.70)$$

Thus, $X_n = k$ if and only if $N_n^+ = \tfrac{1}{2}(n + k)$. From Eq. (5.70), we note that $2N_n^+ = n + X_n$ must be even.
Thus, $X_n$ must be even if $n$ is even, and $X_n$ must be odd if $n$ is odd. We note that $N_n^+$ is a binomial r.v. with
parameters $(n, p)$. Thus, by Eq. (2.36), we obtain

$$p_n(k) = \binom{n}{(n+k)/2}\,p^{(n+k)/2}\,q^{(n-k)/2} \qquad q = 1 - p \qquad (5.71)$$

where $n \ge |k|$, and $n$ and $k$ are either both even or both odd.


5.9. Consider the simple random walk $X(n)$ of Prob. 5.2.

(a) Find the probability that $X(n) = -2$ after four steps.
(b) Verify the result of part (a) by enumerating all possible sample sequences that lead to the
value $X(n) = -2$ after four steps.

(a) Setting $k = -2$ and $n = 4$ in Eq. (5.71), we obtain

$$P(X_4 = -2) = p_4(-2) = \binom{4}{1}\,p\,q^3 = 4pq^3 \qquad q = 1 - p$$

(b) All possible sample functions that lead to the value $X_4 = -2$ after 4 steps are shown in Fig. 5-8. Each
such sample sequence contains one $+1$ step and three $-1$ steps, so it occurs with probability $pq^3$. There are
only four sample functions that lead to the value $X_4 = -2$ after four steps. Thus $P(X_4 = -2) = 4pq^3$.

Fig. 5-8
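The formula of Eq. (5.71) and the enumeration argument of Prob. 5.9 can be compared directly. The sketch below evaluates $p_n(k)$ both ways with exact rational arithmetic; the function names and the choice $p = 1/3$ are assumptions for illustration.

```python
from fractions import Fraction as F
from itertools import product
from math import comb

def p_n(n, k, p):
    """First-order distribution of the simple random walk, Eq. (5.71)."""
    if abs(k) > n or (n + k) % 2:
        return 0                         # unreachable level, or parity mismatch
    up = (n + k) // 2                    # number of +1 steps
    return comb(n, up) * p ** up * (1 - p) ** (n - up)

def p_n_by_enumeration(n, k, p):
    """Sum the probabilities of all step sequences of length n that end at level k."""
    total = 0
    for path in product((1, -1), repeat=n):
        if sum(path) == k:
            ups = path.count(1)
            total += p ** ups * (1 - p) ** (n - ups)
    return total

p = F(1, 3)
formula_value = p_n(4, -2, p)                       # 4 p q^3 with p = 1/3, q = 2/3
enumerated_value = p_n_by_enumeration(4, -2, p)
paths_to_minus_2 = [s for s in product((1, -1), repeat=4) if sum(s) == -2]
```

The list `paths_to_minus_2` contains the four sample sequences of Prob. 5.9(b), and both functions return the same value.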


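5.10. Find the mean and variance of the simple random walk $X(n)$ of Prob. 5.2.

From Eq. (5.66), we have

$$X_n = X_{n-1} + Z_n \qquad n = 1, 2, \ldots \qquad (5.72)$$

and $X_0 = 0$ and $Z_n$ $(n = 1, 2, \ldots)$ are independent and identically distributed (iid) r.v.'s with

$$P(Z_n = +1) = p \qquad P(Z_n = -1) = q = 1 - p$$

From Eq. (5.72), we observe that

$$\begin{aligned} X_1 &= X_0 + Z_1 = Z_1 \\ X_2 &= X_1 + Z_2 = Z_1 + Z_2 \\ &\ \ \vdots \\ X_n &= Z_1 + Z_2 + \cdots + Z_n \end{aligned} \qquad (5.73)$$

Then, because the $Z_n$ are iid r.v.'s and $X_0 = 0$, by Eqs. (4.108) and (4.112), we have

$$E(X_n) = E\left(\sum_{k=1}^{n} Z_k\right) = nE(Z_k)$$

$$\operatorname{Var}(X_n) = \operatorname{Var}\left(\sum_{k=1}^{n} Z_k\right) = n\operatorname{Var}(Z_k)$$

Now

$$E(Z_k) = (1)p + (-1)q = p - q \qquad (5.74)$$

$$E(Z_k^2) = (1)^2 p + (-1)^2 q = p + q = 1 \qquad (5.75)$$

Thus

$$\operatorname{Var}(Z_k) = E(Z_k^2) - [E(Z_k)]^2 = 1 - (p - q)^2 = 4pq \qquad (5.76)$$

Hence,

$$E(X_n) = n(p - q) \qquad q = 1 - p \qquad (5.77)$$

$$\operatorname{Var}(X_n) = 4npq \qquad q = 1 - p \qquad (5.78)$$

Note that if $p = q = \tfrac{1}{2}$, then

$$E(X_n) = 0 \qquad (5.79)$$

$$\operatorname{Var}(X_n) = n \qquad (5.80)$$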
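Equations (5.77) and (5.78) can be confirmed from the first-order distribution of the walk, since the number of $+1$ steps is binomial. The sketch below computes the mean and variance exactly from that distribution; the function name `walk_pmf` and the values $n = 7$, $p = 1/3$ are assumptions chosen for illustration.

```python
from fractions import Fraction as F
from math import comb

def walk_pmf(n, p):
    """Map k -> P(X_n = k), using a binomial count `up` of +1 steps so that k = 2*up - n."""
    return {2 * up - n: comb(n, up) * p ** up * (1 - p) ** (n - up)
            for up in range(n + 1)}

n, p = 7, F(1, 3)
q = 1 - p
pmf = walk_pmf(n, p)

mean = sum(k * w for k, w in pmf.items())                    # should equal n (p - q)
var = sum(k * k * w for k, w in pmf.items()) - mean ** 2     # should equal 4 n p q
```

With these values the mean is $-7/3$ and the variance is $56/9$, matching $n(p - q)$ and $4npq$.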


5.11. Find the autocorrelation function $R_X(n, m)$ of the simple random walk $X(n)$ of Prob. 5.2.

From Eq. (5.73), we can express $X_n$ as

$$X_n = \sum_{i=0}^{n} Z_i \qquad n = 1, 2, \ldots \qquad (5.81)$$

where $Z_0 = X_0 = 0$ and $Z_i$ $(i \ge 1)$ are iid r.v.'s with

$$P(Z_i = +1) = p \qquad P(Z_i = -1) = q = 1 - p$$

By Eq. (5.7),

$$R_X(n, m) = E[X(n)X(m)] = E(X_n X_m)$$

Then by Eq. (5.81),

$$R_X(n, m) = \sum_{i=0}^{n} \sum_{k=0}^{m} E(Z_i Z_k) = \sum_{i=0}^{\min(n, m)} E(Z_i^2) + \sum_{i=0}^{n} \sum_{\substack{k=0 \\ k \ne i}}^{m} E(Z_i)E(Z_k) \qquad (5.82)$$

Using Eqs. (5.74) and (5.75), we obtain

$$R_X(n, m) = \min(n, m) + [nm - \min(n, m)](p - q)^2 \qquad (5.83)$$

or

$$R_X(n, m) = \begin{cases} m + (nm - m)(p - q)^2 & m < n \\ n + (nm - n)(p - q)^2 & n < m \end{cases} \qquad (5.84)$$

Note that if $p = q = \tfrac{1}{2}$, then

$$R_X(n, m) = \min(n, m) \qquad n, m > 0 \qquad (5.85)$$


5.12. Consider the random process $X(t)$ of Prob. 5.4; that is,

$$X(t) = Y \cos \omega t \qquad t \ge 0$$

where $\omega$ is a constant and $Y$ is a uniform r.v. over $(0, 1)$.

(a) Find $E[X(t)]$.
(b) Find the autocorrelation function $R_X(t, s)$ of $X(t)$.
(c) Find the autocovariance function $K_X(t, s)$ of $X(t)$.

(a) From Eqs. (2.46) and (2.91), we have $E(Y) = \tfrac{1}{2}$ and $E(Y^2) = \tfrac{1}{3}$. Thus

$$E[X(t)] = E(Y \cos \omega t) = E(Y) \cos \omega t = \tfrac{1}{2} \cos \omega t \qquad (5.86)$$

(b) By Eq. (5.7), we have

$$R_X(t, s) = E[X(t)X(s)] = E(Y^2 \cos \omega t \cos \omega s) = E(Y^2) \cos \omega t \cos \omega s = \tfrac{1}{3} \cos \omega t \cos \omega s \qquad (5.87)$$

(c) By Eq. (5.10), we have

$$K_X(t, s) = R_X(t, s) - E[X(t)]E[X(s)] = \tfrac{1}{3} \cos \omega t \cos \omega s - \tfrac{1}{4} \cos \omega t \cos \omega s = \tfrac{1}{12} \cos \omega t \cos \omega s \qquad (5.88)$$


5.13. Consider a discrete-parameter random process $X(n) = \{X_n, n \ge 1\}$ where the $X_n$'s are iid r.v.'s
with common cdf $F_X(x)$, mean $\mu$, and variance $\sigma^2$.

(a) Find the joint cdf of $X(n)$.
(b) Find the mean of $X(n)$.
(c) Find the autocorrelation function $R_X(n, m)$ of $X(n)$.
(d) Find the autocovariance function $K_X(n, m)$ of $X(n)$.

(a) Since the $X_n$'s are iid r.v.'s with common cdf $F_X(x)$, the joint cdf of $X(n)$ is given by

$$F_X(x_1, \ldots, x_n) = \prod_{i=1}^{n} F_X(x_i) \qquad (5.89)$$

which reduces to $[F_X(x)]^n$ when $x_1 = \cdots = x_n = x$.

(b) The mean of $X(n)$ is

$$\mu_X(n) = E(X_n) = \mu \qquad \text{for all } n \qquad (5.90)$$

(c) If $n \ne m$, by Eqs. (5.7) and (5.90),

$$R_X(n, m) = E(X_n X_m) = E(X_n)E(X_m) = \mu^2$$

If $n = m$, then by Eq. (2.31),

$$E(X_n^2) = \operatorname{Var}(X_n) + [E(X_n)]^2 = \sigma^2 + \mu^2$$

Hence,

$$R_X(n, m) = \begin{cases} \mu^2 & n \ne m \\ \sigma^2 + \mu^2 & n = m \end{cases} \qquad (5.91)$$

(d) By Eq. (5.10),

$$K_X(n, m) = R_X(n, m) - \mu_X(n)\mu_X(m) = \begin{cases} 0 & n \ne m \\ \sigma^2 & n = m \end{cases} \qquad (5.92)$$
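The autocorrelation result of Prob. 5.11 for the simple random walk, $R_X(n, m) = \min(n, m) + [nm - \min(n, m)](p - q)^2$, can be checked by brute force for small $n$ and $m$. The sketch below sums over every step sequence with exact rational arithmetic; the function names and the test value $p = 1/3$ are assumptions for illustration.

```python
from fractions import Fraction as F
from itertools import product

def autocorr_by_enumeration(n, m, p):
    """E[X_n X_m] for the simple random walk, summing over all 2**max(n, m) step sequences."""
    length = max(n, m)
    total = 0
    for steps in product((1, -1), repeat=length):
        ups = steps.count(1)
        prob = p ** ups * (1 - p) ** (length - ups)
        total += prob * sum(steps[:n]) * sum(steps[:m])   # x_n * x_m on this sequence
    return total

def autocorr_formula(n, m, p):
    """Closed form min(n, m) + [n m - min(n, m)] (p - q)^2 with q = 1 - p."""
    q = 1 - p
    return min(n, m) + (n * m - min(n, m)) * (p - q) ** 2

p = F(1, 3)
pairs = [(n, m) for n in range(1, 5) for m in range(1, 5)]
agree = all(autocorr_by_enumeration(n, m, p) == autocorr_formula(n, m, p) for n, m in pairs)
```

For $p = q = 1/2$ the second term vanishes and the formula reduces to $\min(n, m)$.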


CLASSIFICATION OF RANDOM PROCESSES 


5.14. Show that a random process which is stationary to order $n$ is also stationary to all orders lower
than $n$.

Assume that Eq. (5.14) holds for some particular $n$; that is,

$$P\{X(t_1) \le x_1, \ldots, X(t_n) \le x_n\} = P\{X(t_1 + \tau) \le x_1, \ldots, X(t_n + \tau) \le x_n\}$$

for any $\tau$. Letting $x_n \to \infty$, we have [see Eq. (3.63)]

$$P\{X(t_1) \le x_1, \ldots, X(t_{n-1}) \le x_{n-1}\} = P\{X(t_1 + \tau) \le x_1, \ldots, X(t_{n-1} + \tau) \le x_{n-1}\}$$

and the process is stationary to order $n - 1$. Continuing the same procedure, we see that the process is
stationary to all orders lower than $n$.


5.15. Show that if $\{X(t), t \in T\}$ is a strict-sense stationary random process, then it is also WSS.

Since $X(t)$ is strict-sense stationary, the first- and second-order distributions are invariant through time
translation for all $\tau \in T$. Then we have

$$\mu_X(t) = E[X(t)] = E[X(t + \tau)] = \mu_X(t + \tau)$$

and hence the mean function $\mu_X(t)$ must be constant; that is,

$$E[X(t)] = \mu \quad \text{(constant)}$$

Similarly, we have

$$E[X(s)X(t)] = E[X(s + \tau)X(t + \tau)]$$

so that the autocorrelation function would depend on the time points $s$ and $t$ only through the difference
$|t - s|$. Thus, $X(t)$ is WSS.


5.16. Let $\{X_n, n \ge 0\}$ be a sequence of iid r.v.'s with mean 0 and variance 1. Show that $\{X_n, n \ge 0\}$ is
a WSS process.

By Eq. (5.90),

$$E(X_n) = 0 \quad \text{(constant)} \qquad \text{for all } n$$

and by Eq. (5.91),

$$R_X(n, n + k) = E(X_n X_{n+k}) = \begin{cases} E(X_n)E(X_{n+k}) = 0 & k \ne 0 \\ E(X_n^2) = \operatorname{Var}(X_n) = 1 & k = 0 \end{cases}$$

which depends only on $k$. Thus, $\{X_n\}$ is a WSS process.


5.17. Show that if a random process $X(t)$ is WSS, then it must also be covariance stationary.

If $X(t)$ is WSS, then

$$E[X(t)] = \mu \quad \text{(constant)} \qquad \text{for all } t$$

$$R_X(t, t + \tau) = R_X(\tau) \qquad \text{for all } t$$

Now

$$K_X(t, t + \tau) = \operatorname{Cov}[X(t), X(t + \tau)] = R_X(t, t + \tau) - E[X(t)]E[X(t + \tau)] = R_X(\tau) - \mu^2$$

which indicates that $K_X(t, t + \tau)$ depends only on $\tau$; thus, $X(t)$ is covariance stationary.


5.18. Consider a random process $X(t)$ defined by

$$X(t) = U \cos \omega t + V \sin \omega t \qquad -\infty < t < \infty \qquad (5.93)$$

where $\omega$ is constant and $U$ and $V$ are r.v.'s.

(a) Show that the condition

$$E(U) = E(V) = 0 \qquad (5.94)$$

is necessary for $X(t)$ to be stationary.

(b) Show that $X(t)$ is WSS if and only if $U$ and $V$ are uncorrelated with equal variance; that is,

$$E(UV) = 0 \qquad E(U^2) = E(V^2) = \sigma^2 \qquad (5.95)$$

(a) Now

$$\mu_X(t) = E[X(t)] = E(U) \cos \omega t + E(V) \sin \omega t$$

must be independent of $t$ for $X(t)$ to be stationary. This is possible only if $\mu_X(t) = 0$, that is,
$E(U) = E(V) = 0$.

(b) If $X(t)$ is WSS, then

$$E[X^2(0)] = E\left[X^2\left(\frac{\pi}{2\omega}\right)\right] = R_{XX}(0) = \sigma_X^2$$

But $X(0) = U$ and $X(\pi/2\omega) = V$; thus

$$E(U^2) = E(V^2) = \sigma_X^2 = \sigma^2$$

Using the above result, we obtain

$$\begin{aligned} R_{XX}(t, t + \tau) &= E[X(t)X(t + \tau)] \\ &= E\{(U \cos \omega t + V \sin \omega t)[U \cos \omega(t + \tau) + V \sin \omega(t + \tau)]\} \\ &= \sigma^2 \cos \omega\tau + E(UV) \sin(2\omega t + \omega\tau) \end{aligned} \qquad (5.96)$$

which will be a function of $\tau$ only if $E(UV) = 0$. Conversely, if $E(UV) = 0$ and $E(U^2) = E(V^2) = \sigma^2$,
then from the result of part (a) and Eq. (5.96), we have

$$\mu_X(t) = 0$$

$$R_{XX}(t, t + \tau) = \sigma^2 \cos \omega\tau = R_{XX}(\tau)$$

Hence, $X(t)$ is WSS.


5.19. Consider a random process $X(t)$ defined by

$$X(t) = U \cos t + V \sin t \qquad -\infty < t < \infty$$

where $U$ and $V$ are independent r.v.'s, each of which assumes the values $-2$ and $1$ with the
probabilities $\tfrac{1}{3}$ and $\tfrac{2}{3}$, respectively. Show that $X(t)$ is WSS but not strict-sense stationary.

We have

$$E(U) = E(V) = \tfrac{1}{3}(-2) + \tfrac{2}{3}(1) = 0$$

$$E(U^2) = E(V^2) = \tfrac{1}{3}(-2)^2 + \tfrac{2}{3}(1)^2 = 2$$

Since $U$ and $V$ are independent,

$$E(UV) = E(U)E(V) = 0$$

Thus, by the results of Prob. 5.18, $X(t)$ is WSS. To see if $X(t)$ is strict-sense stationary, we consider $E[X^3(t)]$.

$$\begin{aligned} E[X^3(t)] &= E[(U \cos t + V \sin t)^3] \\ &= E(U^3) \cos^3 t + 3E(U^2 V) \cos^2 t \sin t + 3E(UV^2) \cos t \sin^2 t + E(V^3) \sin^3 t \end{aligned}$$

Now

$$E(U^3) = E(V^3) = \tfrac{1}{3}(-2)^3 + \tfrac{2}{3}(1)^3 = -2$$

$$E(U^2 V) = E(U^2)E(V) = 0 \qquad E(UV^2) = E(U)E(V^2) = 0$$

Thus

$$E[X^3(t)] = -2(\cos^3 t + \sin^3 t)$$

which is a function of $t$. From Eq. (5.16), we see that all the moments of a strict-sense stationary process
must be independent of time. Thus $X(t)$ is not strict-sense stationary.


5.20. Consider a random process $X(t)$ defined by

$$X(t) = A \cos(\omega t + \Theta) \qquad -\infty < t < \infty$$

where $A$ and $\omega$ are constants and $\Theta$ is a uniform r.v. over $(-\pi, \pi)$. Show that $X(t)$ is WSS.

From Eq. (2.44), we have

$$f_\Theta(\theta) = \begin{cases} \dfrac{1}{2\pi} & -\pi < \theta < \pi \\ 0 & \text{otherwise} \end{cases}$$

Then

$$\mu_X(t) = \frac{A}{2\pi} \int_{-\pi}^{\pi} \cos(\omega t + \theta)\,d\theta = 0 \qquad (5.97)$$

Setting $s = t + \tau$ in Eq. (5.7), we have

$$\begin{aligned} R_{XX}(t, t + \tau) &= \frac{A^2}{2\pi} \int_{-\pi}^{\pi} \cos(\omega t + \theta) \cos[\omega(t + \tau) + \theta]\,d\theta \\ &= \frac{A^2}{2\pi} \int_{-\pi}^{\pi} \frac{1}{2}[\cos \omega\tau + \cos(2\omega t + 2\theta + \omega\tau)]\,d\theta \\ &= \frac{A^2}{2} \cos \omega\tau \end{aligned} \qquad (5.98)$$

Since the mean of $X(t)$ is a constant and the autocorrelation of $X(t)$ is a function of time difference only, we
conclude that $X(t)$ is WSS.


5.21. Let $\{X(t), t \ge 0\}$ be a random process with stationary independent increments, and assume that
$X(0) = 0$. Show that

$$E[X(t)] = \mu_1 t \qquad (5.99)$$

where $\mu_1 = E[X(1)]$.

Let

$$f(t) = E[X(t)] = E[X(t) - X(0)]$$

Then, for any $t$ and $s$ and using Eq. (4.108) and the property of the stationary independent increments, we
have

$$\begin{aligned} f(t + s) &= E[X(t + s) - X(0)] \\ &= E[X(t + s) - X(s) + X(s) - X(0)] \\ &= E[X(t + s) - X(s)] + E[X(s) - X(0)] \\ &= E[X(t) - X(0)] + E[X(s) - X(0)] \\ &= f(t) + f(s) \end{aligned} \qquad (5.100)$$

The only solution to the above functional equation is $f(t) = ct$, where $c$ is a constant. Since $c = f(1) =
E[X(1)]$, we obtain

$$E[X(t)] = \mu_1 t \qquad \mu_1 = E[X(1)]$$
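The conclusion of Prob. 5.19 (WSS but not strict-sense stationary) can be checked numerically, because $U$ and $V$ take only two values each and every expectation is a sum over four outcomes. In the sketch below the helper names `expect`, `X`, `autocorr`, and `third_moment` are assumptions introduced for this example.

```python
import math
from itertools import product

# Values and probabilities of U (and of V): -2 with probability 1/3, 1 with probability 2/3.
values = [(-2, 1 / 3), (1, 2 / 3)]

def expect(g):
    """E[g(U, V)] for independent U and V, summing over the four joint outcomes."""
    return sum(pu * pv * g(u, v) for (u, pu), (v, pv) in product(values, repeat=2))

def X(u, v, t):
    return u * math.cos(t) + v * math.sin(t)

def autocorr(t, tau):
    """E[X(t) X(t + tau)]; for a WSS process this does not depend on t."""
    return expect(lambda u, v: X(u, v, t) * X(u, v, t + tau))

def third_moment(t):
    """E[X(t)^3]; for a strict-sense stationary process this would not depend on t."""
    return expect(lambda u, v: X(u, v, t) ** 3)

mean_at_1 = expect(lambda u, v: X(u, v, 1.0))     # zero, as for every t
r_a, r_b = autocorr(0.3, 0.5), autocorr(1.7, 0.5) # both equal 2 cos(0.5)
m3_a, m3_b = third_moment(0.0), third_moment(math.pi)   # -2 and +2
```

The autocorrelation matches $\sigma^2 \cos \tau$ with $\sigma^2 = 2$ at both time origins, while the third moment changes sign between $t = 0$ and $t = \pi$, in agreement with $E[X^3(t)] = -2(\cos^3 t + \sin^3 t)$.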


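5.22. Let $\{X(t), t \ge 0\}$ be a random process with stationary independent increments, and assume that
$X(0) = 0$. Show that

$$\text{(a)} \quad \operatorname{Var}[X(t)] = \sigma_1^2 t \qquad (5.101)$$

$$\text{(b)} \quad \operatorname{Var}[X(t) - X(s)] = \sigma_1^2 (t - s) \qquad t > s \qquad (5.102)$$

where $\sigma_1^2 = \operatorname{Var}[X(1)]$.

(a) Let

$$g(t) = \operatorname{Var}[X(t)] = \operatorname{Var}[X(t) - X(0)]$$

Then, for any $t$ and $s$ and using Eq. (4.112) and the property of the stationary independent increments,
we get

$$\begin{aligned} g(t + s) &= \operatorname{Var}[X(t + s) - X(0)] \\ &= \operatorname{Var}[X(t + s) - X(s) + X(s) - X(0)] \\ &= \operatorname{Var}[X(t + s) - X(s)] + \operatorname{Var}[X(s) - X(0)] \\ &= \operatorname{Var}[X(t) - X(0)] + \operatorname{Var}[X(s) - X(0)] \\ &= g(t) + g(s) \end{aligned}$$

which is the same functional equation as Eq. (5.100). Thus, $g(t) = kt$, where $k$ is a constant. Since
$k = g(1) = \operatorname{Var}[X(1)]$, we obtain

$$\operatorname{Var}[X(t)] = \sigma_1^2 t \qquad \sigma_1^2 = \operatorname{Var}[X(1)]$$

(b) Let $t > s$. Then

$$\begin{aligned} \operatorname{Var}[X(t)] &= \operatorname{Var}[X(t) - X(s) + X(s) - X(0)] \\ &= \operatorname{Var}[X(t) - X(s)] + \operatorname{Var}[X(s) - X(0)] \\ &= \operatorname{Var}[X(t) - X(s)] + \operatorname{Var}[X(s)] \end{aligned}$$

Thus, using Eq. (5.101), we obtain

$$\operatorname{Var}[X(t) - X(s)] = \operatorname{Var}[X(t)] - \operatorname{Var}[X(s)] = \sigma_1^2 (t - s)$$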


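5.23. Let $\{X(t), t \ge 0\}$ be a random process with stationary independent increments, and assume that
$X(0) = 0$. Show that

$$\operatorname{Cov}[X(t), X(s)] = K_X(t, s) = \sigma_1^2 \min(t, s) \qquad (5.103)$$

where $\sigma_1^2 = \operatorname{Var}[X(1)]$.

By definition (2.28),

$$\begin{aligned} \operatorname{Var}[X(t) - X(s)] &= E(\{X(t) - X(s) - E[X(t) - X(s)]\}^2) \\ &= E[(\{X(t) - E[X(t)]\} - \{X(s) - E[X(s)]\})^2] \\ &= E(\{X(t) - E[X(t)]\}^2 - 2\{X(t) - E[X(t)]\}\{X(s) - E[X(s)]\} + \{X(s) - E[X(s)]\}^2) \\ &= \operatorname{Var}[X(t)] - 2\operatorname{Cov}[X(t), X(s)] + \operatorname{Var}[X(s)] \end{aligned}$$

Thus,

$$\operatorname{Cov}[X(t), X(s)] = \tfrac{1}{2}\{\operatorname{Var}[X(t)] + \operatorname{Var}[X(s)] - \operatorname{Var}[X(t) - X(s)]\}$$

Using Eqs. (5.101) and (5.102), we obtain

$$K_X(t, s) = \begin{cases} \tfrac{1}{2}\sigma_1^2 [t + s - (t - s)] = \sigma_1^2 s & t > s \\ \tfrac{1}{2}\sigma_1^2 [t + s - (s - t)] = \sigma_1^2 t & s > t \end{cases}$$

or

$$K_X(t, s) = \sigma_1^2 \min(t, s)$$

where $\sigma_1^2 = \operatorname{Var}[X(1)]$.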


5.24. (a) Show that a simple random walk $X(n)$ of Prob. 5.2 is a Markov chain.
(b) Find its one-step transition probabilities.

(a) From Eq. (5.73) (Prob. 5.10), $X(n) = \{X_n, n \ge 0\}$ can be expressed as

$$X_0 = 0 \qquad X_n = \sum_{i=1}^{n} Z_i \qquad n \ge 1$$

where $Z_n$ $(n = 1, 2, \ldots)$ are iid r.v.'s with

$$P(Z_n = k) = a_k \quad (k = 1, -1) \qquad \text{and} \qquad a_1 = p \quad a_{-1} = q = 1 - p$$

Then $X(n) = \{X_n, n \ge 0\}$ is a Markov chain, since

$$\begin{aligned} P(X_{n+1} = i_{n+1} \mid X_0 = 0, X_1 = i_1, \ldots, X_n = i_n) &= P(Z_{n+1} + i_n = i_{n+1} \mid X_0 = 0, X_1 = i_1, \ldots, X_n = i_n) \\ &= P(Z_{n+1} = i_{n+1} - i_n) = a_{i_{n+1} - i_n} = P(X_{n+1} = i_{n+1} \mid X_n = i_n) \end{aligned}$$

since $Z_{n+1}$ is independent of $X_0, X_1, \ldots, X_n$.

(b) The one-step transition probabilities are given by

$$p_{jk} = P(X_n = k \mid X_{n-1} = j) = \begin{cases} p & k = j + 1 \\ q = 1 - p & k = j - 1 \\ 0 & \text{otherwise} \end{cases}$$

which do not depend on $n$. Thus, a simple random walk $X(n)$ is a homogeneous Markov chain.


5.25. Show that for a Markov process $X(t)$, the second-order distribution is sufficient to characterize
$X(t)$.

Let $X(t)$ be a Markov process with the $n$th-order distribution

$$F_X(x_1, x_2, \ldots, x_n; t_1, t_2, \ldots, t_n) = P\{X(t_1) \le x_1, X(t_2) \le x_2, \ldots, X(t_n) \le x_n\}$$

Then, using the Markov property (5.26), we have

$$\begin{aligned} F_X(x_1, x_2, \ldots, x_n; t_1, t_2, \ldots, t_n) &= P\{X(t_n) \le x_n \mid X(t_1) \le x_1, X(t_2) \le x_2, \ldots, X(t_{n-1}) \le x_{n-1}\} \\ &\quad \times P\{X(t_1) \le x_1, X(t_2) \le x_2, \ldots, X(t_{n-1}) \le x_{n-1}\} \\ &= P\{X(t_n) \le x_n \mid X(t_{n-1}) \le x_{n-1}\}\,F_X(x_1, \ldots, x_{n-1}; t_1, \ldots, t_{n-1}) \end{aligned}$$

Applying the above relation repeatedly for lower-order distribution, we can write

$$F_X(x_1, x_2, \ldots, x_n; t_1, t_2, \ldots, t_n) = F_X(x_1; t_1) \prod_{k=2}^{n} P\{X(t_k) \le x_k \mid X(t_{k-1}) \le x_{k-1}\} \qquad (5.104)$$

Hence, all finite-order distributions of a Markov process can be completely determined by the second-order
distribution.


5.26. Show that if a normal process is WSS, then it is also strict-sense stationary.

By Eq. (5.29), a normal random process X(t) is completely characterized by the specification of the
mean E[X(t)] and the covariance function $K_X(t, s)$ of the process. Suppose that X(t) is WSS. Then, by Eqs.
(5.21) and (5.22), Eq. (5.29) becomes

$$\Psi_{X(t_1) \cdots X(t_n)}(\omega_1, \ldots, \omega_n) = \exp\left\{ j \sum_{i=1}^{n} \mu \omega_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n} K_X(t_i - t_k)\, \omega_i \omega_k \right\} \tag{5.105}$$

Now we translate all of the time instants $t_1, t_2, \ldots, t_n$ by the same amount τ. The joint characteristic
function of the new r.v.’s $X(t_i + \tau)$, i = 1, 2, ..., n, is then

$$\begin{aligned}
\Psi_{X(t_1+\tau) \cdots X(t_n+\tau)}(\omega_1, \ldots, \omega_n)
&= \exp\left\{ j \sum_{i=1}^{n} \mu \omega_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n} K_X[t_i + \tau - (t_k + \tau)]\, \omega_i \omega_k \right\} \\
&= \exp\left\{ j \sum_{i=1}^{n} \mu \omega_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n} K_X(t_i - t_k)\, \omega_i \omega_k \right\} \\
&= \Psi_{X(t_1) \cdots X(t_n)}(\omega_1, \ldots, \omega_n)
\end{aligned} \tag{5.106}$$

which indicates that the joint characteristic function (and hence the corresponding joint pdf) is unaffected by
a shift in the time origin. Since this result holds for any n and any set of time instants ($t_i \in T$, i = 1, 2, ..., n),
it follows that if a normal process is WSS, then it is also strict-sense stationary.


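5.27. Let {X(t), −∞ < t < ∞} be a zero-mean, stationary, normal process with the autocorrelation
function

$$R_X(\tau) = \begin{cases} 1 - \dfrac{|\tau|}{T} & -T \le \tau \le T \\ 0 & \text{otherwise} \end{cases} \tag{5.107}$$

Let {$X(t_i)$, i = 1, 2, ..., n} be a sequence of n samples of the process taken at the time instants

$$t_i = i\,\frac{T}{2} \qquad i = 1, 2, \ldots, n$$

Find the mean and the variance of the sample mean

$$\hat{\mu}_n = \frac{1}{n} \sum_{i=1}^{n} X(t_i) \tag{5.108}$$

Since X(t) is zero-mean and stationary, we have

$$E[X(t_i)] = 0$$

and

$$R_X(t_i, t_k) = E[X(t_i) X(t_k)] = R_X(t_k - t_i) = R_X\left[(k - i)\frac{T}{2}\right]$$

Thus

$$E(\hat{\mu}_n) = E\left[\frac{1}{n} \sum_{i=1}^{n} X(t_i)\right] = \frac{1}{n} \sum_{i=1}^{n} E[X(t_i)] = 0 \tag{5.109}$$

and

$$\begin{aligned}
\operatorname{Var}(\hat{\mu}_n) &= E\{[\hat{\mu}_n - E(\hat{\mu}_n)]^2\} = E(\hat{\mu}_n^2) \\
&= E\left\{\left[\frac{1}{n} \sum_{i=1}^{n} X(t_i)\right]\left[\frac{1}{n} \sum_{k=1}^{n} X(t_k)\right]\right\} \\
&= \frac{1}{n^2} \sum_{i=1}^{n} \sum_{k=1}^{n} E[X(t_i) X(t_k)] = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{k=1}^{n} R_X\left[(k - i)\frac{T}{2}\right]
\end{aligned}$$

By Eq. (5.107),

$$R_X\left[(k - i)\frac{T}{2}\right] = \begin{cases} 1 & k = i \\ \tfrac{1}{2} & |k - i| = 1 \\ 0 & |k - i| \ge 2 \end{cases}$$

Thus

$$\operatorname{Var}(\hat{\mu}_n) = \frac{1}{n^2}\left[n(1) + 2(n - 1)\left(\tfrac{1}{2}\right) + 0\right] = \frac{1}{n^2}(2n - 1) \tag{5.110}$$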
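The double sum leading to Eq. (5.110) can be checked by brute force. The following is a minimal sketch, assuming the sampling instants are spaced T/2 apart as in the problem; the function names `autocorr` and `var_sample_mean` are illustrative and do not come from the text.

```python
from fractions import Fraction

def autocorr(lag_steps):
    """R_X at lag lag_steps * T/2 for the triangular autocorrelation (5.107)."""
    tau_over_T = Fraction(abs(lag_steps), 2)   # |tau| / T
    return max(Fraction(0), 1 - tau_over_T)

def var_sample_mean(n):
    """Variance of the sample mean: (1/n^2) * double sum of R_X over sample pairs."""
    total = sum(autocorr(k - i) for i in range(n) for k in range(n))
    return total / n**2

# Compare the brute-force double sum with the closed form (2n - 1) / n^2.
for n in (1, 2, 5, 10):
    print(n, var_sample_mean(n), Fraction(2 * n - 1, n**2))
```

Exact rational arithmetic is used so that the two columns can be compared for equality rather than to within rounding error.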


DISCRETE-PARAMETER MARKOV CHAINS 


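5.28. Show that if P is a Markov matrix, then $P^n$ is also a Markov matrix for any positive integer n.

Let

$$P = [p_{ij}] = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1m} \\ p_{21} & p_{22} & \cdots & p_{2m} \\ \vdots & \vdots & & \vdots \\ p_{m1} & p_{m2} & \cdots & p_{mm} \end{bmatrix}$$

Then by the property of a Markov matrix [Eq. (5.35)], we can write

$$\begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1m} \\ p_{21} & p_{22} & \cdots & p_{2m} \\ \vdots & \vdots & & \vdots \\ p_{m1} & p_{m2} & \cdots & p_{mm} \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}$$

or

$$Pa = a \tag{5.111}$$

where $a^T = [1 \;\; 1 \;\; \cdots \;\; 1]$.

Premultiplying both sides of Eq. (5.111) by P, we obtain

$$P^2 a = Pa = a$$

which indicates that $P^2$ is also a Markov matrix. Repeated premultiplication by P yields

$$P^n a = a$$

which shows that $P^n$ is also a Markov matrix.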


5.29. Verify Eq. (5.39); that is,

$$p(n) = p(0)P^n$$

We verify Eq. (5.39) by induction. If the state of $X_0$ is i, state $X_1$ will be j only if a transition is made
from i to j. The events {$X_0 = i$, i = 1, 2, ...} are mutually exclusive, and one of them must occur. Hence, by
the law of total probability [Eq. (1.44)],

$$P(X_1 = j) = \sum_i P(X_0 = i) P(X_1 = j \mid X_0 = i)$$

or

$$p_j(1) = \sum_i p_i(0)\, p_{ij} \qquad j = 1, 2, \ldots \tag{5.112}$$

In terms of vectors and matrices, Eq. (5.112) can be expressed as

$$p(1) = p(0)P \tag{5.113}$$

Thus, Eq. (5.39) is true for n = 1. Assume now that Eq. (5.39) is true for n = k; that is,

$$p(k) = p(0)P^k$$

Again, by the law of total probability,

$$P(X_{k+1} = j) = \sum_i P(X_k = i) P(X_{k+1} = j \mid X_k = i)$$

or

$$p_j(k + 1) = \sum_i p_i(k)\, p_{ij} \qquad j = 1, 2, \ldots \tag{5.114}$$

In terms of vectors and matrices, Eq. (5.114) can be expressed as

$$p(k + 1) = p(k)P = p(0)P^k P = p(0)P^{k+1} \tag{5.115}$$

which indicates that Eq. (5.39) is true for k + 1. Hence, we conclude that Eq. (5.39) is true for all n ≥ 1.


5.30. Consider a two-state Markov chain with the transition probability matrix 


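$$P = \begin{bmatrix} 1 - a & a \\ b & 1 - b \end{bmatrix} \qquad 0 < a < 1, \; 0 < b < 1 \tag{5.116}$$

(a) Show that the n-step transition probability matrix $P^n$ is given by

$$P^n = \frac{1}{a + b} \left\{ \begin{bmatrix} b & a \\ b & a \end{bmatrix} + (1 - a - b)^n \begin{bmatrix} a & -a \\ -b & b \end{bmatrix} \right\} \tag{5.117}$$

(b) Find $P^n$ when n → ∞.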


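(a) From matrix analysis, the characteristic equation of P is

$$c(\lambda) = |\lambda I - P| = \begin{vmatrix} \lambda - (1 - a) & -a \\ -b & \lambda - (1 - b) \end{vmatrix} = (\lambda - 1)(\lambda - 1 + a + b) = 0$$

Thus, the eigenvalues of P are $\lambda_1 = 1$ and $\lambda_2 = 1 - a - b$. Then, using the spectral decomposition
method, $P^n$ can be expressed as

$$P^n = \lambda_1^n E_1 + \lambda_2^n E_2 \tag{5.118}$$

where $E_1$ and $E_2$ are constituent matrices of P, given by

$$E_1 = \frac{1}{\lambda_1 - \lambda_2}(P - \lambda_2 I) \qquad E_2 = \frac{1}{\lambda_2 - \lambda_1}(P - \lambda_1 I) \tag{5.119}$$

Substituting $\lambda_1 = 1$ and $\lambda_2 = 1 - a - b$ in the above expressions, we obtain

$$E_1 = \frac{1}{a + b} \begin{bmatrix} b & a \\ b & a \end{bmatrix} \qquad E_2 = \frac{1}{a + b} \begin{bmatrix} a & -a \\ -b & b \end{bmatrix}$$

Thus, by Eq. (5.118), we obtain

$$P^n = E_1 + (1 - a - b)^n E_2 = \frac{1}{a + b} \left\{ \begin{bmatrix} b & a \\ b & a \end{bmatrix} + (1 - a - b)^n \begin{bmatrix} a & -a \\ -b & b \end{bmatrix} \right\} \tag{5.120}$$

(b) If 0 < a < 1 and 0 < b < 1, then 0 < 1 − a < 1 and |1 − a − b| < 1. So $\lim_{n \to \infty} (1 - a - b)^n = 0$ and

$$\lim_{n \to \infty} P^n = \frac{1}{a + b} \begin{bmatrix} b & a \\ b & a \end{bmatrix} \tag{5.121}$$

Note that a limiting matrix exists and has the same rows (see Prob. 5.47).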
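The closed form for the n-step matrix can be compared against direct repeated multiplication. The sketch below is illustrative only: the helper names (`matmul`, `matpow`, `closed_form`) and the sample values a = 1/10, b = 2/10 are assumptions chosen for the check, and exact fractions are used so the comparison is an equality.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(P, n):
    """P^n by repeated multiplication, starting from the identity."""
    size = len(P)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, P)
    return result

def closed_form(a, b, n):
    """Two-state n-step matrix: (1/(a+b)) * {[[b,a],[b,a]] + (1-a-b)^n [[a,-a],[-b,b]]}."""
    s, r = a + b, (1 - a - b) ** n
    return [[(b + r * a) / s, (a - r * a) / s],
            [(b - r * b) / s, (a + r * b) / s]]

a, b = Fraction(1, 10), Fraction(2, 10)   # sample values for the check
P = [[1 - a, a], [b, 1 - b]]
for n in (1, 2, 5):
    print(n, matpow(P, n) == closed_form(a, b, n))
```

As n grows, the second term decays like (1 − a − b)^n, so both rows approach [b/(a + b), a/(a + b)].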


5.31. An example of a two-state Markov chain is provided by a communication network consisting of
the sequence (or cascade) of stages of binary communication channels shown in Fig. 5-9. Here $X_n$
denotes the digit leaving the nth stage of the channel and $X_0$ denotes the digit entering the first
stage. The transition probability matrix of this communication network is often called the
channel matrix and is given by Eq. (5.116); that is,

$$P = \begin{bmatrix} 1 - a & a \\ b & 1 - b \end{bmatrix} \qquad 0 < a < 1, \; 0 < b < 1$$

Fig. 5-9 Binary communication network.

Assume that a = 0.1 and b = 0.2, and the initial distribution is $P(X_0 = 0) = P(X_0 = 1) = 0.5$.

(a) Find the distribution of $X_n$.
(b) Find the distribution of $X_n$ when n → ∞.


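(a) The channel matrix of the communication network is

$$P = \begin{bmatrix} 0.9 & 0.1 \\ 0.2 & 0.8 \end{bmatrix}$$

and the initial distribution is

$$p(0) = [0.5 \;\; 0.5]$$

By Eq. (5.39), the distribution of $X_n$ is given by

$$p(n) = p(0)P^n = [0.5 \;\; 0.5] \begin{bmatrix} 0.9 & 0.1 \\ 0.2 & 0.8 \end{bmatrix}^n$$

Letting a = 0.1 and b = 0.2 in Eq. (5.117), we get

$$\begin{bmatrix} 0.9 & 0.1 \\ 0.2 & 0.8 \end{bmatrix}^n = \frac{1}{0.3} \begin{bmatrix} 0.2 & 0.1 \\ 0.2 & 0.1 \end{bmatrix} + \frac{(0.7)^n}{0.3} \begin{bmatrix} 0.1 & -0.1 \\ -0.2 & 0.2 \end{bmatrix} = \begin{bmatrix} \dfrac{2 + (0.7)^n}{3} & \dfrac{1 - (0.7)^n}{3} \\[2mm] \dfrac{2 - 2(0.7)^n}{3} & \dfrac{1 + 2(0.7)^n}{3} \end{bmatrix}$$

Thus, the distribution of $X_n$ is

$$p(n) = [0.5 \;\; 0.5] \begin{bmatrix} \dfrac{2 + (0.7)^n}{3} & \dfrac{1 - (0.7)^n}{3} \\[2mm] \dfrac{2 - 2(0.7)^n}{3} & \dfrac{1 + 2(0.7)^n}{3} \end{bmatrix} = \left[ \frac{2}{3} - \frac{(0.7)^n}{6} \quad \frac{1}{3} + \frac{(0.7)^n}{6} \right]$$

that is,

$$P(X_n = 0) = \frac{2}{3} - \frac{(0.7)^n}{6} \qquad \text{and} \qquad P(X_n = 1) = \frac{1}{3} + \frac{(0.7)^n}{6}$$

(b) Since $\lim_{n \to \infty} (0.7)^n = 0$, the distribution of $X_n$ when n → ∞ is

$$P(X_\infty = 0) = \tfrac{2}{3} \qquad \text{and} \qquad P(X_\infty = 1) = \tfrac{1}{3}$$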


5.32. Verify the transitivity property of the Markov chain; that is, if i → j and j → k, then i → k.

By definition, the relations i → j and j → k imply that there exist integers n and m such that $p_{ij}^{(n)} > 0$
and $p_{jk}^{(m)} > 0$. Then, by the Chapman-Kolmogorov equation (5.38), we have

$$p_{ik}^{(n+m)} = \sum_r p_{ir}^{(n)} p_{rk}^{(m)} \ge p_{ij}^{(n)} p_{jk}^{(m)} > 0 \tag{5.122}$$

Therefore i → k.


5.33. Verify Eq. (5.42).

If the Markov chain {$X_n$} goes from state i to state j in m steps, the first step must take the chain from i
to some state k, where k ≠ j. Now after that first step to k, we have m − 1 steps left, and the chain must get
to state j, from state k, on the last of those steps. That is, the first visit to state j must occur on the (m − 1)st
step, starting now in state k. Thus we must have

$$f_{ij}^{(m)} = \sum_{k \ne j} p_{ik} f_{kj}^{(m-1)} \qquad m = 2, 3, \ldots$$


5.34. Show that in a finite-state Markov chain, not all states can be transient.

Suppose that the states are 0, 1, ..., m, and suppose that they are all transient. Then by definition, after
a finite amount of time (say $T_0$), state 0 will never be visited; after a finite amount of time (say $T_1$), state 1
will never be visited; and so on. Thus, after a finite time $T = \max\{T_0, T_1, \ldots, T_m\}$, no state will be visited.
But as the process must be in some state after time T, we have a contradiction. Thus, we conclude that not
all states can be transient and at least one of the states must be recurrent.


5.35. A state transition diagram of a finite-state Markov chain is a line diagram with a vertex
corresponding to each state and a directed line between two vertices i and j if $p_{ij} > 0$. In such a
diagram, if one can move from i to j by a path following the arrows, then i → j. The diagram is
useful to determine whether a finite-state Markov chain is irreducible or not, or to check for
periodicities. Draw the state transition diagrams and classify the states of the Markov chains
with the following transition probability matrices:

$$\text{(a)} \quad P = \begin{bmatrix} 0 & 0.5 & 0.5 \\ 0.5 & 0 & 0.5 \\ 0.5 & 0.5 & 0 \end{bmatrix} \qquad \text{(b)} \quad P = \begin{bmatrix} 0 & 0 & 0.5 & 0.5 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$

$$\text{(c)} \quad P = \begin{bmatrix} 0.3 & 0.4 & 0 & 0 & 0.3 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.6 & 0.4 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}$$
Fig. 5-10 State transition diagram.

(a) The state transition diagram of the Markov chain with P of part (a) is shown in Fig. 5-10(a). From Fig.
5-10(a), it is seen that the Markov chain is irreducible and aperiodic. For instance, one can get back to
state 0 in two steps by going from 0 to 1 to 0. However, one can also get back to state 0 in three steps
by going from 0 to 1 to 2 to 0. Hence 0 is aperiodic. Similarly, we can see that states 1 and 2 are also
aperiodic.

(b) The state transition diagram of the Markov chain with P of part (b) is shown in Fig. 5-10(b). From Fig.
5-10(b), it is seen that the Markov chain is irreducible and periodic with period 3.

(c) The state transition diagram of the Markov chain with P of part (c) is shown in Fig. 5-10(c). From Fig.
5-10(c), it is seen that the Markov chain is not irreducible, since states 0 and 4 do not communicate,
and state 1 is absorbing.
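The classification read off the diagrams can also be checked mechanically from the matrices of parts (a) and (c). This is a rough sketch rather than a general-purpose tool: the function names are illustrative, and `period` only inspects return times up to an assumed cutoff `max_steps`, which is adequate for chains this small.

```python
from math import gcd

def reachable(P, start):
    """States reachable from `start` along positive-probability transitions."""
    seen, stack = {start}, [start]
    while stack:
        i = stack.pop()
        for j, pij in enumerate(P[i]):
            if pij > 0 and j not in seen:
                seen.add(j)
                stack.append(j)
    return seen

def is_irreducible(P):
    """True if every state can reach every other state."""
    return all(len(reachable(P, i)) == len(P) for i in range(len(P)))

def period(P, state, max_steps=50):
    """gcd of all n <= max_steps with p_ii^(n) > 0 (0 if no return is found)."""
    d, current = 0, {state}
    for n in range(1, max_steps + 1):
        current = {j for i in current for j, pij in enumerate(P[i]) if pij > 0}
        if state in current:
            d = gcd(d, n)
    return d

# Matrices of parts (a) and (c).
Pa = [[0, 0.5, 0.5], [0.5, 0, 0.5], [0.5, 0.5, 0]]
Pc = [[0.3, 0.4, 0, 0, 0.3],
      [0, 1, 0, 0, 0],
      [0, 0, 0, 0.6, 0.4],
      [0, 0, 0, 0, 1],
      [0, 0, 1, 0, 0]]

print(is_irreducible(Pa), period(Pa, 0))      # chain (a): irreducible, period of state 0
print(is_irreducible(Pc), reachable(Pc, 1))   # chain (c): state 1 reaches only itself
```

A period of 1 corresponds to an aperiodic state, and a state whose reachable set is just itself is absorbing.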


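5.36. Consider a Markov chain with state space {0, 1} and transition probability matrix

$$P = \begin{bmatrix} 1 & 0 \\ \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}$$

(a) Show that state 0 is recurrent.
(b) Show that state 1 is transient.

(a) By Eqs. (5.41) and (5.42), we have

$$f_{00}^{(1)} = p_{00} = 1 \qquad f_{10}^{(1)} = p_{10} = \tfrac{1}{2}$$

$$f_{00}^{(2)} = p_{01} f_{10}^{(1)} = (0)\tfrac{1}{2} = 0$$

$$f_{00}^{(n)} = 0 \qquad n \ge 2$$

Then, by Eq. (5.43),

$$f_{00} = P(T_0 < \infty \mid X_0 = 0) = \sum_{n=0}^{\infty} f_{00}^{(n)} = 1 + 0 + 0 + \cdots = 1$$

Thus, by definition (5.44), state 0 is recurrent.

(b) Similarly, we have

$$f_{11}^{(1)} = p_{11} = \tfrac{1}{2} \qquad f_{01}^{(1)} = p_{01} = 0$$

$$f_{11}^{(2)} = p_{10} f_{01}^{(1)} = \left(\tfrac{1}{2}\right) 0 = 0$$

$$f_{11}^{(n)} = 0 \qquad n \ge 2$$

and

$$f_{11} = P(T_1 < \infty \mid X_0 = 1) = \sum_{n=0}^{\infty} f_{11}^{(n)} = \tfrac{1}{2} + 0 + 0 + \cdots = \tfrac{1}{2} < 1$$

Thus, by definition (5.48), state 1 is transient.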
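The first-passage recursion used above (start from the one-step probabilities, then sum over intermediate states other than the target) is easy to iterate numerically. The sketch below assumes the transition probabilities that appear in the solution, namely p00 = 1, p01 = 0, and p10 = p11 = 1/2; the function name `first_passage` and the truncation at 30 steps are illustrative choices.

```python
from fractions import Fraction

def first_passage(P, j, n_max):
    """Rows f[n-1][i] = f_ij^(n) for n = 1..n_max.

    Uses f_ij^(1) = p_ij and f_ij^(n) = sum over k != j of p_ik * f_kj^(n-1).
    """
    m = len(P)
    f = [[P[i][j] for i in range(m)]]                      # n = 1
    for _ in range(2, n_max + 1):
        prev = f[-1]
        f.append([sum(P[i][k] * prev[k] for k in range(m) if k != j)
                  for i in range(m)])
    return f

half = Fraction(1, 2)
P = [[Fraction(1), Fraction(0)], [half, half]]

f00 = sum(row[0] for row in first_passage(P, 0, 30))   # return probability of state 0
f11 = sum(row[1] for row in first_passage(P, 1, 30))   # return probability of state 1
print(f00, f11)
```

A return probability equal to 1 marks a recurrent state; a value below 1 marks a transient one.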
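5.37. Consider a Markov chain with state space {0, 1, 2} and transition probability matrix

$$P = \begin{bmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$

Show that state 0 is periodic with period 2.

The characteristic equation of P is given by

$$c(\lambda) = |\lambda I - P| = \begin{vmatrix} \lambda & -\tfrac{1}{2} & -\tfrac{1}{2} \\ -1 & \lambda & 0 \\ -1 & 0 & \lambda \end{vmatrix} = \lambda^3 - \lambda = 0$$

Since P satisfies its own characteristic equation (Cayley-Hamilton theorem), we have $P^3 = P$. Thus, for n ≥ 1,

$$P^{(2n)} = P^2 = \begin{bmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}$$

$$P^{(2n+1)} = P = \begin{bmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$

Therefore

$$d(0) = \gcd\{n \ge 1 : p_{00}^{(n)} > 0\} = \gcd\{2, 4, 6, \ldots\} = 2$$

Thus, state 0 is periodic with period 2.
Note that the state transition diagram corresponding to the given P is shown in Fig. 5-11. From Fig.
5-11, it is clear that state 0 is periodic with period 2.

Fig. 5-11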


5.38. Let two gamblers, A and B, initially have k dollars and m dollars, respectively. Suppose that at 
each round of their game, A wins one dollar from B with probability p and loses one dollar to B 
with probability g = 1 — p. Assume that A and B play until one of them has no money left. (This 
is known as the Gambler’s Ruin problem.) Let $X_n$ be A’s capital after round n, where n = 0, 1,
2, ... and $X_0 = k$.

(a) Show that X(n) = {$X_n$, n ≥ 0} is a Markov chain with absorbing states.
(b) Find its transition probability matrix P.

(a) The total capital of the two players at all times is

$$k + m = N$$

Let $Z_n$ (n ≥ 1) be independent r.v.’s with $P(Z_n = 1) = p$ and $P(Z_n = -1) = q = 1 - p$ for all n.
Then

$$X_n = X_{n-1} + Z_n \qquad n = 1, 2, \ldots$$

and $X_0 = k$. The game ends when $X_n = 0$ or $X_n = N$. Thus, by Probs. 5.2 and 5.24, X(n) = {$X_n$, n ≥ 0}
is a Markov chain with state space E = {0, 1, 2, ..., N}, where states 0 and N are absorbing states. The
Markov chain X(n) is also known as a simple random walk with absorbing barriers.


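(b) Since

$$\begin{aligned}
p_{i,\,i+1} &= P(X_{n+1} = i + 1 \mid X_n = i) = p \\
p_{i,\,i-1} &= P(X_{n+1} = i - 1 \mid X_n = i) = q \\
p_{i,\,i} &= P(X_{n+1} = i \mid X_n = i) = 0 \qquad i \ne 0, N \\
p_{0,\,0} &= P(X_{n+1} = 0 \mid X_n = 0) = 1 \\
p_{N,\,N} &= P(X_{n+1} = N \mid X_n = N) = 1
\end{aligned}$$

the transition probability matrix P is

$$P = \begin{bmatrix} 1 & 0 & 0 & 0 & \cdots & \cdots & 0 \\ q & 0 & p & 0 & \cdots & \cdots & 0 \\ 0 & q & 0 & p & \cdots & \cdots & 0 \\ \vdots & & & \ddots & & & \vdots \\ 0 & 0 & 0 & 0 & q & 0 & p \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \tag{5.123}$$

For example, when p = q = ½ and N = 4,

$$P = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$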


5.39. Consider a homogeneous Markov chain X(n) = {$X_n$, n ≥ 0} with a finite state space E = {0, 1,
..., N}, of which A = {0, 1, ..., m}, m ≥ 1, is a set of absorbing states and B = {m + 1, ..., N} is
a set of nonabsorbing states. It is assumed that at least one of the absorbing states in A is
accessible from any nonabsorbing state in B. Show that absorption of X(n) in one or another of
the absorbing states is certain.

If $X_0 \in A$, then there is nothing to prove, since X(n) is already absorbed. Let $X_0 \in B$. By assumption,
there is at least one state in A which is accessible from any state in B. Now assume that state k ∈ A is
accessible from j ∈ B. Let $n_{jk}$ (< ∞) be the smallest number n such that $p_{jk}^{(n)} > 0$. For a given state j, let $n_j$
be the largest of $n_{jk}$ as k varies and n′ be the largest of $n_j$ as j varies. After n′ steps, no matter what the initial
state of X(n), there is a probability p > 0 that X(n) is in an absorbing state. Therefore

$$P\{X_{n'} \in B\} = 1 - p$$

and 0 < 1 − p < 1. It follows by homogeneity and the Markov property that

$$P\{X_{kn'} \in B\} = (1 - p)^k \qquad k = 1, 2, \ldots$$

Now since $\lim_{k \to \infty} (1 - p)^k = 0$, we have

$$\lim_{n \to \infty} P\{X_n \in B\} = 0 \qquad \text{or} \qquad \lim_{n \to \infty} P\{X_n \in \bar{B} = A\} = 1$$

which shows that absorption of X(n) in one or another of the absorbing states is certain.


5.40. Verify Eq. (5.50).

Let X(n) = {$X_n$, n ≥ 0} be a homogeneous Markov chain with a finite state space E = {0, 1, ..., N}, of
which A = {0, 1, ..., m}, m ≥ 1, is a set of absorbing states and B = {m + 1, ..., N} is a set of nonabsorbing
states. Let state k ∈ B at the first step go to i ∈ E with probability $p_{ki}$. Then

$$u_{kj} = P\{X_n = j\,(\in A) \mid X_0 = k\,(\in B)\} = \sum_{i=1}^{N} p_{ki}\, P\{X_n = j\,(\in A) \mid X_0 = i\} \tag{5.124}$$

Now

$$P\{X_n = j\,(\in A) \mid X_0 = i\} = \begin{cases} 1 & i = j \\ 0 & i \in A, \; i \ne j \\ u_{ij} & i \in B, \; i = m + 1, \ldots, N \end{cases}$$

Then Eq. (5.124) becomes

$$u_{kj} = p_{kj} + \sum_{i=m+1}^{N} p_{ki}\, u_{ij} \qquad k = m + 1, \ldots, N; \; j = 1, \ldots, m \tag{5.125}$$

But $p_{kj}$, k = m + 1, ..., N; j = 1, ..., m, are the elements of R, whereas $p_{ki}$, k = m + 1, ..., N; i = m + 1, ...,
N are the elements of Q [see Eq. (5.49a)]. Hence, in matrix notation, Eq. (5.125) can be expressed as

$$U = R + QU \qquad \text{or} \qquad (I - Q)U = R \tag{5.126}$$

Premultiplying both sides of the second equation of Eq. (5.126) with $(I - Q)^{-1}$, we obtain

$$U = (I - Q)^{-1} R = \Phi R$$


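5.41. Consider a simple random walk X(n) with absorbing barriers at state 0 and state N = 3 (see
Prob. 5.38).

(a) Find the transition probability matrix P.
(b) Find the probabilities of absorption into states 0 and 3.

(a) The transition probability matrix P is [Eq. (5.123)]

$$P = \begin{array}{c|cccc} & 0 & 1 & 2 & 3 \\ \hline 0 & 1 & 0 & 0 & 0 \\ 1 & q & 0 & p & 0 \\ 2 & 0 & q & 0 & p \\ 3 & 0 & 0 & 0 & 1 \end{array}$$

(b) Rearranging the transition probability matrix P as [Eq. (5.49a)],

$$P = \begin{array}{c|cccc} & 0 & 3 & 1 & 2 \\ \hline 0 & 1 & 0 & 0 & 0 \\ 3 & 0 & 1 & 0 & 0 \\ 1 & q & 0 & 0 & p \\ 2 & 0 & p & q & 0 \end{array}$$

and by Eq. (5.49b), the matrices Q and R are given by

$$R = \begin{bmatrix} p_{10} & p_{13} \\ p_{20} & p_{23} \end{bmatrix} = \begin{bmatrix} q & 0 \\ 0 & p \end{bmatrix} \qquad Q = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix} = \begin{bmatrix} 0 & p \\ q & 0 \end{bmatrix}$$

Then

$$I - Q = \begin{bmatrix} 1 & -p \\ -q & 1 \end{bmatrix}$$

and

$$\Phi = (I - Q)^{-1} = \frac{1}{1 - pq} \begin{bmatrix} 1 & p \\ q & 1 \end{bmatrix} \tag{5.127}$$

By Eq. (5.50),

$$U = \begin{bmatrix} u_{10} & u_{13} \\ u_{20} & u_{23} \end{bmatrix} = \Phi R = \frac{1}{1 - pq} \begin{bmatrix} 1 & p \\ q & 1 \end{bmatrix} \begin{bmatrix} q & 0 \\ 0 & p \end{bmatrix} = \frac{1}{1 - pq} \begin{bmatrix} q & p^2 \\ q^2 & p \end{bmatrix} \tag{5.128}$$

Thus, the probabilities of absorption into state 0 from states 1 and 2 are given, respectively, by

$$u_{10} = \frac{q}{1 - pq} \qquad \text{and} \qquad u_{20} = \frac{q^2}{1 - pq}$$

and the probabilities of absorption into state 3 from states 1 and 2 are given, respectively, by

$$u_{13} = \frac{p^2}{1 - pq} \qquad \text{and} \qquad u_{23} = \frac{p}{1 - pq}$$

Note that

$$u_{10} + u_{13} = \frac{q + p^2}{1 - pq} = \frac{1 - p + p^2}{1 - p(1 - p)} = 1$$

$$u_{20} + u_{23} = \frac{q^2 + p}{1 - pq} = \frac{q^2 + (1 - q)}{1 - (1 - q)q} = 1$$

which confirm the proposition of Prob. 5.39.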
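For a concrete value of p, the absorption probabilities can be computed directly from the fundamental matrix. The following sketch hard-codes the 2 × 2 inverse for this N = 3 walk; the function name `absorption_probs` and the sample value p = 1/3 are illustrative assumptions.

```python
from fractions import Fraction

def absorption_probs(p):
    """Return (Phi, U) for the walk on {0, 1, 2, 3} with absorbing states 0 and 3.

    Q = [[0, p], [q, 0]] and R = [[q, 0], [0, p]] for transient states 1 and 2,
    with the columns of R ordered as absorbing states (0, 3).
    """
    q = 1 - p
    det = 1 - p * q                                   # determinant of I - Q
    phi = [[1 / det, p / det], [q / det, 1 / det]]    # (I - Q)^{-1}
    R = [[q, 0], [0, p]]
    U = [[sum(phi[i][k] * R[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return phi, U

phi, U = absorption_probs(Fraction(1, 3))   # sample value of p
print(U)                                    # rows: start in state 1, start in state 2
print([sum(row) for row in U])              # each row should sum to 1
```

Each row of U sums to 1, in line with the certainty of absorption shown in Prob. 5.39.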


5.42. Consider the simple random walk X(n) with absorbing barriers at 0 and 3 (Prob. 5.41). Find the
expected time (or steps) to absorption when $X_0 = 1$ and when $X_0 = 2$.

The fundamental matrix Φ of X(n) is [Eq. (5.127)]

$$\Phi = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} = \frac{1}{1 - pq} \begin{bmatrix} 1 & p \\ q & 1 \end{bmatrix}$$

Let $T_i$ be the time to absorption when $X_0 = i$. Then by Eq. (5.51), we get

$$E(T_1) = \frac{1}{1 - pq}(1 + p) \qquad E(T_2) = \frac{1}{1 - pq}(q + 1) \tag{5.129}$$


5.43. Consider the gambler’s game described in Prob. 5.38. What is the probability of A’s losing all his
money?

Let P(k), k = 0, 1, 2, ..., N, denote the probability that A loses all his money when his initial capital is
k dollars. Equivalently, P(k) is the probability of absorption at state 0 when $X_0 = k$ in the simple random
walk X(n) with absorbing barriers at states 0 and N. Now if 0 < k < N, then

$$P(k) = pP(k + 1) + qP(k - 1) \qquad k = 1, 2, \ldots, N - 1 \tag{5.130}$$

where pP(k + 1) is the probability that A wins the first round and subsequently loses all his money and
qP(k − 1) is the probability that A loses the first round and subsequently loses all his money. Rewriting Eq.
(5.130), we have

$$P(k + 1) - \frac{1}{p} P(k) + \frac{q}{p} P(k - 1) = 0 \qquad k = 1, 2, \ldots, N - 1 \tag{5.131}$$

which is a second-order homogeneous linear constant-coefficient difference equation. Next, we have

$$P(0) = 1 \qquad \text{and} \qquad P(N) = 0 \tag{5.132}$$

since if k = 0, absorption at 0 is a sure event, and if k = N, absorption at N has occurred and absorption at
0 is impossible. Thus, finding P(k) reduces to solving Eq. (5.131) subject to the boundary conditions given
by Eq. (5.132). Let $P(k) = r^k$. Then Eq. (5.131) becomes

$$r^{k+1} - \frac{1}{p} r^k + \frac{q}{p} r^{k-1} = 0 \qquad p + q = 1$$

Setting k = 1 (and noting that p + q = 1), we get

$$r^2 - \frac{1}{p} r + \frac{q}{p} = (r - 1)\left(r - \frac{q}{p}\right) = 0$$

from which we get r = 1 and r = q/p. Thus,

$$P(k) = c_1 + c_2 \left(\frac{q}{p}\right)^k \qquad q \ne p \tag{5.133}$$

where $c_1$ and $c_2$ are arbitrary constants. Now, by Eq. (5.132),

$$P(0) = 1 \;\rightarrow\; c_1 + c_2 = 1$$

$$P(N) = 0 \;\rightarrow\; c_1 + c_2 \left(\frac{q}{p}\right)^N = 0$$

Solving for $c_1$ and $c_2$, we obtain

$$c_1 = \frac{-(q/p)^N}{1 - (q/p)^N} \qquad c_2 = \frac{1}{1 - (q/p)^N}$$

Hence

$$P(k) = \frac{(q/p)^k - (q/p)^N}{1 - (q/p)^N} \qquad q \ne p \tag{5.134}$$

Note that if N ≫ k,

$$P(k) \approx \begin{cases} 1 & q > p \\ \left(\dfrac{q}{p}\right)^k & p > q \end{cases} \tag{5.135}$$

Setting r = q/p in Eq. (5.134), we have

$$P(k) = \frac{r^k - r^N}{1 - r^N} \;\xrightarrow[r \to 1]{}\; 1 - \frac{k}{N}$$

Thus, when p = q = ½,

$$P(k) = 1 - \frac{k}{N} \tag{5.136}$$
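The ruin probability can be sanity-checked by confirming that the closed form satisfies the difference equation P(k) = pP(k + 1) + qP(k − 1) together with the boundary conditions P(0) = 1 and P(N) = 0. This is a small sketch; the function name `ruin_prob` and the sample values N = 6, p = 2/5 are assumptions made for the check.

```python
from fractions import Fraction

def ruin_prob(k, N, p):
    """Probability that A, starting with k of the N dollars, loses everything."""
    q = 1 - p
    if p == q:
        return 1 - Fraction(k, N)          # fair game: 1 - k/N
    r = q / p
    return (r**k - r**N) / (1 - r**N)      # unfair game: closed form in r = q/p

N, p = 6, Fraction(2, 5)                   # sample values
q = 1 - p
for k in range(1, N):
    lhs = ruin_prob(k, N, p)
    rhs = p * ruin_prob(k + 1, N, p) + q * ruin_prob(k - 1, N, p)
    print(k, lhs, lhs == rhs)
print(ruin_prob(0, N, p), ruin_prob(N, N, p))   # boundary values
```

With N = 3 and p = 1/3, `ruin_prob(1, 3, Fraction(1, 3))` evaluates to q/(1 − pq) = 6/7, the same absorption probability obtained from the matrix method.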


5.44. Show that Eq. (5.134) is consistent with Eq. (5.128). 
Substituting k = 1 and N = 3 in Eq. (5.134), and noting that p + q = 1, we have

$$P(1) = \frac{(q/p) - (q/p)^3}{1 - (q/p)^3} = \frac{q(p^2 - q^2)}{p^3 - q^3} = \frac{q(p + q)}{p^2 + pq + q^2} = \frac{q}{(p + q)^2 - pq} = \frac{q}{1 - pq}$$

Now from Eq. (5.128), we have

$$u_{10} = \frac{q}{1 - pq} = P(1)$$


5.45. Consider the simple random walk X(n) with state space E = {0, 1, 2, ..., N}, where 0 and N are
absorbing states (Prob. 5.38). Let r.v. $T_k$ denote the time (or number of steps) to absorption of
X(n) when $X_0 = k$, k = 0, 1, ..., N. Find $E(T_k)$.

Let $Y(k) = E(T_k)$. Clearly, if k = 0 or k = N, then absorption is immediate, and we have

$$Y(0) = Y(N) = 0 \tag{5.137}$$

Let the probability that absorption takes m steps when $X_0 = k$ be defined by

$$P(k, m) = P(T_k = m) \qquad m = 1, 2, \ldots \tag{5.138}$$

Then, we have (Fig. 5-12)

$$P(k, m) = pP(k + 1, m - 1) + qP(k - 1, m - 1) \tag{5.139}$$

and

$$Y(k) = E(T_k) = \sum_{m=1}^{\infty} m P(k, m) = p \sum_{m=1}^{\infty} m P(k + 1, m - 1) + q \sum_{m=1}^{\infty} m P(k - 1, m - 1)$$

Setting m − 1 = i, we get

$$\begin{aligned}
Y(k) &= p \sum_{i=0}^{\infty} (i + 1) P(k + 1, i) + q \sum_{i=0}^{\infty} (i + 1) P(k - 1, i) \\
&= p \sum_{i=0}^{\infty} i P(k + 1, i) + q \sum_{i=0}^{\infty} i P(k - 1, i) + p \sum_{i=0}^{\infty} P(k + 1, i) + q \sum_{i=0}^{\infty} P(k - 1, i)
\end{aligned}$$

Fig. 5-12 Simple random walk with absorbing barriers.

Now by the result of Prob. 5.39, we see that absorption is certain; therefore

$$\sum_{i=0}^{\infty} P(k + 1, i) = \sum_{i=0}^{\infty} P(k - 1, i) = 1$$

Thus

$$Y(k) = pY(k + 1) + qY(k - 1) + p + q$$

or

$$Y(k) = pY(k + 1) + qY(k - 1) + 1 \qquad k = 1, 2, \ldots, N - 1 \tag{5.140}$$

Rewriting Eq. (5.140), we have

$$Y(k + 1) - \frac{1}{p} Y(k) + \frac{q}{p} Y(k - 1) = -\frac{1}{p} \tag{5.141}$$

Thus, finding Y(k) reduces to solving Eq. (5.141) subject to the boundary conditions given by Eq. (5.137).
Let the general solution of Eq. (5.141) be

$$Y(k) = Y_h(k) + Y_p(k)$$

where $Y_h(k)$ is the homogeneous solution satisfying

$$Y_h(k + 1) - \frac{1}{p} Y_h(k) + \frac{q}{p} Y_h(k - 1) = 0 \tag{5.142}$$

and $Y_p(k)$ is the particular solution satisfying

$$Y_p(k + 1) - \frac{1}{p} Y_p(k) + \frac{q}{p} Y_p(k - 1) = -\frac{1}{p} \tag{5.143}$$

Let $Y_p(k) = \alpha k$, where α is a constant. Then Eq. (5.143) becomes

$$(k + 1)\alpha - \frac{1}{p} k\alpha + \frac{q}{p}(k - 1)\alpha = -\frac{1}{p}$$

from which we get α = 1/(q − p) and

$$Y_p(k) = \frac{k}{q - p} \qquad p \ne q \tag{5.144}$$

Since Eq. (5.142) is the same as Eq. (5.131), by Eq. (5.133), we obtain

$$Y_h(k) = c_1 + c_2 \left(\frac{q}{p}\right)^k \qquad q \ne p \tag{5.145}$$

where $c_1$ and $c_2$ are arbitrary constants. Hence, the general solution of Eq. (5.141) is

$$Y(k) = c_1 + c_2 \left(\frac{q}{p}\right)^k + \frac{k}{q - p} \qquad q \ne p \tag{5.146}$$

Now, by Eq. (5.137),

$$Y(0) = 0 \;\rightarrow\; c_1 + c_2 = 0$$

$$Y(N) = 0 \;\rightarrow\; c_1 + c_2 \left(\frac{q}{p}\right)^N + \frac{N}{q - p} = 0$$

Solving for $c_1$ and $c_2$, we obtain

$$c_1 = \frac{-N/(q - p)}{1 - (q/p)^N} \qquad c_2 = \frac{N/(q - p)}{1 - (q/p)^N}$$

Substituting these values in Eq. (5.146), we obtain (for p ≠ q)

$$Y(k) = E(T_k) = \frac{1}{q - p} \left\{ k - N \left[ \frac{1 - (q/p)^k}{1 - (q/p)^N} \right] \right\} \tag{5.147}$$

When p = q = ½, we have

$$Y(k) = E(T_k) = k(N - k) \qquad p = q = \tfrac{1}{2} \tag{5.148}$$
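As a check on the expected duration, the closed form can be substituted back into the recursion Y(k) = pY(k + 1) + qY(k − 1) + 1 with Y(0) = Y(N) = 0. The sketch below does this with exact fractions; `expected_duration` and the sample values N = 6, p = 2/5 are illustrative assumptions.

```python
from fractions import Fraction

def expected_duration(k, N, p):
    """Expected number of rounds until absorption, starting from capital k."""
    q = 1 - p
    if p == q:
        return Fraction(k * (N - k))                       # fair game
    r = q / p
    return (k - N * (1 - r**k) / (1 - r**N)) / (q - p)     # unfair game

N, p = 6, Fraction(2, 5)   # sample values
q = 1 - p
for k in range(1, N):
    lhs = expected_duration(k, N, p)
    rhs = p * expected_duration(k + 1, N, p) + q * expected_duration(k - 1, N, p) + 1
    print(k, lhs, lhs == rhs)
```

For N = 3 and p = 1/3 the function gives 12/7 and 15/7 for k = 1 and k = 2, which agrees with the row sums (1 + p)/(1 − pq) and (q + 1)/(1 − pq) of the fundamental matrix for the same walk.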
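5.46. Consider a Markov chain with two states and transition probability matrix

$$P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

(a) Find the stationary distribution p of the chain.
(b) Find $\lim_{n \to \infty} P^n$.

(a) By definition (5.52),

$$pP = p$$

or

$$[p_1 \;\; p_2] \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = [p_1 \;\; p_2]$$

which yields $p_1 = p_2$. Since $p_1 + p_2 = 1$, we obtain

$$p = [\tfrac{1}{2} \;\; \tfrac{1}{2}]$$

(b) Now

$$P^n = \begin{cases} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} & n = 2, 4, 6, \ldots \\[4mm] \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} & n = 1, 3, 5, \ldots \end{cases}$$

and $\lim_{n \to \infty} P^n$ does not exist.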


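5.47. Consider a Markov chain with two states and transition probability matrix

$$P = \begin{bmatrix} \tfrac{3}{4} & \tfrac{1}{4} \\ \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}$$

(a) Find the stationary distribution p of the chain.
(b) Find $\lim_{n \to \infty} P^n$.
(c) Find $\lim_{n \to \infty} P^n$ by first evaluating $P^n$.

(a) By definition (5.52), we have

$$pP = p$$

or

$$[p_1 \;\; p_2] \begin{bmatrix} \tfrac{3}{4} & \tfrac{1}{4} \\ \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix} = [p_1 \;\; p_2]$$

which yields

$$\tfrac{3}{4} p_1 + \tfrac{1}{2} p_2 = p_1$$

$$\tfrac{1}{4} p_1 + \tfrac{1}{2} p_2 = p_2$$

Each of these equations is equivalent to $p_1 = 2p_2$. Since $p_1 + p_2 = 1$, we obtain

$$p = [\tfrac{2}{3} \;\; \tfrac{1}{3}]$$

(b) Since the Markov chain is regular, by Eq. (5.53), we obtain

$$\lim_{n \to \infty} P^n = \begin{bmatrix} p \\ p \end{bmatrix} = \begin{bmatrix} \tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix}$$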
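(c) Setting a = ¼ and b = ½ in Eq. (5.120) (Prob. 5.30), we get

$$P^n = \begin{bmatrix} \tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} + \left(\tfrac{1}{4}\right)^n \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{2}{3} & \tfrac{2}{3} \end{bmatrix}$$

Since $\lim_{n \to \infty} (\tfrac{1}{4})^n = 0$, we obtain

$$\lim_{n \to \infty} P^n = \begin{bmatrix} \tfrac{2}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix}$$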
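The stationary distribution and the convergence of the state distribution can be illustrated by iterating p(n + 1) = p(n)P. The sketch below assumes a = 1/4 and b = 1/2 as in part (c), so that P has rows [3/4, 1/4] and [1/2, 1/2]; the helper name `step` and the starting distribution [1, 0] are illustrative choices.

```python
from fractions import Fraction

def step(p, P):
    """One update of the row vector: p(n+1) = p(n) P."""
    return [sum(p[i] * P[i][j] for i in range(len(P))) for j in range(len(P))]

P = [[Fraction(3, 4), Fraction(1, 4)], [Fraction(1, 2), Fraction(1, 2)]]
p_hat = [Fraction(2, 3), Fraction(1, 3)]

# The stationary distribution is a fixed point of the update.
print(step(p_hat, P) == p_hat)

# Starting from state 0 with certainty, the distribution approaches p_hat.
p = [Fraction(1), Fraction(0)]
for n in range(10):
    p = step(p, P)
gap = max(abs(p[j] - p_hat[j]) for j in range(2))
print(float(gap))
```

The gap shrinks by a factor of 1 − a − b = 1/4 per step, which is the geometric term in the closed form for the n-step matrix.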
5.48. Let $T_n$ denote the arrival time of the nth customer at a service station. Let $Z_n$ denote the time
interval between the arrival of the nth customer and the (n − 1)st customer; that is,

$$Z_n = T_n - T_{n-1} \qquad n \ge 1 \tag{5.149}$$

and $T_0 = 0$. Let {X(t), t ≥ 0} be the counting process associated with {$T_n$, n ≥ 0}. Show that if
X(t) has stationary increments, then $Z_n$, n = 1, 2, ..., are identically distributed r.v.’s.

We have

$$P(Z_n > z) = 1 - P(Z_n \le z) = 1 - F_{Z_n}(z)$$

By Eq. (5.149),

$$P(Z_n > z) = P(T_n - T_{n-1} > z) = P(T_n > T_{n-1} + z)$$

Suppose that the observed value of $T_{n-1}$ is $t_{n-1}$. The event ($T_n > T_{n-1} + z \mid T_{n-1} = t_{n-1}$) occurs if and only if
X(t) does not change count during the time interval ($t_{n-1}$, $t_{n-1} + z$) (Fig. 5-13). Thus,

$$\begin{aligned}
P(Z_n > z \mid T_{n-1} = t_{n-1}) &= P(T_n > T_{n-1} + z \mid T_{n-1} = t_{n-1}) \\
&= P[X(t_{n-1} + z) - X(t_{n-1}) = 0]
\end{aligned}$$

or

$$P(Z_n \le z \mid T_{n-1} = t_{n-1}) = 1 - P[X(t_{n-1} + z) - X(t_{n-1}) = 0] \tag{5.150}$$

Since X(t) has stationary increments, the probability on the right-hand side of Eq. (5.150) is a function only
of the time difference z. Thus

$$P(Z_n \le z \mid T_{n-1} = t_{n-1}) = 1 - P[X(z) = 0] \tag{5.151}$$

which shows that the conditional distribution function on the left-hand side of Eq. (5.151) is independent of
the particular value of n in this case, and hence we have

$$F_{Z_n}(z) = P(Z_n \le z) = 1 - P[X(z) = 0] \tag{5.152}$$

which shows that the cdf of $Z_n$ is independent of n. Thus we conclude that the $Z_n$’s are identically
distributed r.v.’s.

Fig. 5-13


5.49. Show that Definition 5.6.2 implies Definition 5.6.1. 
Let $p_n(t) = P[X(t) = n]$. Then, by condition 2 of Definition 5.6.2, we have

$$\begin{aligned}
p_0(t + \Delta t) = P[X(t + \Delta t) = 0] &= P[X(t) = 0, X(t + \Delta t) - X(t) = 0] \\
&= P[X(t) = 0]\, P[X(t + \Delta t) - X(t) = 0]
\end{aligned}$$

Now, by Eq. (5.59), we have

$$P[X(t + \Delta t) - X(t) = 0] = 1 - \lambda \Delta t + o(\Delta t)$$

Thus,

$$p_0(t + \Delta t) = p_0(t)[1 - \lambda \Delta t + o(\Delta t)]$$

or

$$\frac{p_0(t + \Delta t) - p_0(t)}{\Delta t} = -\lambda p_0(t) + \frac{o(\Delta t)}{\Delta t}$$

Letting Δt → 0, and by Eq. (5.58), we obtain

$$p_0'(t) = -\lambda p_0(t) \tag{5.153}$$

Solving the above differential equation, we get

$$p_0(t) = k e^{-\lambda t}$$

where k is an integration constant. Since $p_0(0) = P[X(0) = 0] = 1$, we obtain

$$p_0(t) = e^{-\lambda t} \tag{5.154}$$

Similarly, for n > 0,

$$\begin{aligned}
p_n(t + \Delta t) &= P[X(t + \Delta t) = n] \\
&= P[X(t) = n, X(t + \Delta t) - X(t) = 0] \\
&\quad + P[X(t) = n - 1, X(t + \Delta t) - X(t) = 1] + \sum_{k=2}^{n} P[X(t) = n - k, X(t + \Delta t) - X(t) = k]
\end{aligned}$$

Now, by condition 4 of Definition 5.6.2, the last term in the above expression is o(Δt). Thus, by conditions 2
and 3 of Definition 5.6.2, we have

$$p_n(t + \Delta t) = p_n(t)[1 - \lambda \Delta t + o(\Delta t)] + p_{n-1}(t)[\lambda \Delta t + o(\Delta t)] + o(\Delta t)$$

Thus

$$\frac{p_n(t + \Delta t) - p_n(t)}{\Delta t} = -\lambda p_n(t) + \lambda p_{n-1}(t) + \frac{o(\Delta t)}{\Delta t}$$

and letting Δt → 0 yields

$$p_n'(t) + \lambda p_n(t) = \lambda p_{n-1}(t) \tag{5.155}$$

Multiplying both sides by $e^{\lambda t}$, we get

$$e^{\lambda t}[p_n'(t) + \lambda p_n(t)] = \lambda e^{\lambda t} p_{n-1}(t)$$

Hence

$$\frac{d}{dt}[e^{\lambda t} p_n(t)] = \lambda e^{\lambda t} p_{n-1}(t) \tag{5.156}$$

Then by Eq. (5.154), we have

$$\frac{d}{dt}[e^{\lambda t} p_1(t)] = \lambda$$

or

$$p_1(t) = (\lambda t + c) e^{-\lambda t}$$

where c is an integration constant. Since $p_1(0) = P[X(0) = 1] = 0$, we obtain

$$p_1(t) = \lambda t e^{-\lambda t} \tag{5.157}$$

To show that

$$p_n(t) = e^{-\lambda t} \frac{(\lambda t)^n}{n!}$$

we use mathematical induction. Assume that it is true for n − 1; that is,

$$p_{n-1}(t) = e^{-\lambda t} \frac{(\lambda t)^{n-1}}{(n - 1)!}$$

Substituting the above expression into Eq. (5.156), we have

$$\frac{d}{dt}[e^{\lambda t} p_n(t)] = \frac{\lambda^n t^{n-1}}{(n - 1)!}$$

Integrating, we get

$$e^{\lambda t} p_n(t) = \frac{(\lambda t)^n}{n!} + c_1$$

Since $p_n(0) = 0$, $c_1 = 0$, and we obtain

$$p_n(t) = e^{-\lambda t} \frac{(\lambda t)^n}{n!} \tag{5.158}$$

which is Eq. (5.55) of Definition 5.6.1. Thus we conclude that Definition 5.6.2 implies Definition 5.6.1.


5.50. Verify Eq. (5.59).

We note first that X(t) can assume only nonnegative integer values; therefore, the same is true for the
counting increment X(t + Δt) − X(t). Thus, summing over all possible values of the increment, we get

$$\begin{aligned}
\sum_{k=0}^{\infty} P[X(t + \Delta t) - X(t) = k] &= P[X(t + \Delta t) - X(t) = 0] \\
&\quad + P[X(t + \Delta t) - X(t) = 1] + P[X(t + \Delta t) - X(t) \ge 2] \\
&= 1
\end{aligned}$$

Substituting conditions 3 and 4 of Definition 5.6.2 into the above equation, we obtain

$$P[X(t + \Delta t) - X(t) = 0] = 1 - \lambda \Delta t + o(\Delta t)$$


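5.51. (a) Using the Poisson probability distribution in Eq. (5.158), obtain an analytical expression for
the correction term o(Δt) in the expression (condition 3 of Definition 5.6.2)

$$P[X(t + \Delta t) - X(t) = 1] = \lambda \Delta t + o(\Delta t) \tag{5.159}$$

(b) Show that this correction term does have the property of Eq. (5.58); that is,

$$\lim_{\Delta t \to 0} \frac{o(\Delta t)}{\Delta t} = 0$$

(a) Since the Poisson process X(t) has stationary increments, Eq. (5.159) can be rewritten as

$$P[X(\Delta t) = 1] = p_1(\Delta t) = \lambda \Delta t + o(\Delta t) \tag{5.160}$$

Using Eq. (5.158) [or Eq. (5.157)], we have

$$\begin{aligned}
p_1(\Delta t) &= \lambda \Delta t\, e^{-\lambda \Delta t} = \lambda \Delta t (1 + e^{-\lambda \Delta t} - 1) \\
&= \lambda \Delta t + \lambda \Delta t (e^{-\lambda \Delta t} - 1)
\end{aligned}$$

Equating the above expression with Eq. (5.160), we get

$$\lambda \Delta t + o(\Delta t) = \lambda \Delta t + \lambda \Delta t (e^{-\lambda \Delta t} - 1)$$

from which we obtain

$$o(\Delta t) = \lambda \Delta t (e^{-\lambda \Delta t} - 1) \tag{5.161}$$

(b) From Eq. (5.161), we have

$$\lim_{\Delta t \to 0} \frac{o(\Delta t)}{\Delta t} = \lim_{\Delta t \to 0} \frac{\lambda \Delta t (e^{-\lambda \Delta t} - 1)}{\Delta t} = \lim_{\Delta t \to 0} \lambda (e^{-\lambda \Delta t} - 1) = 0$$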


5.52. Find the autocorrelation function $R_X(t, s)$ and the autocovariance function $K_X(t, s)$ of a Poisson
process X(t) with rate λ.

From Eqs. (5.56) and (5.57),

$$E[X(t)] = \lambda t \qquad \operatorname{Var}[X(t)] = \lambda t$$

Now, the Poisson process X(t) is a random process with stationary independent increments and X(0) = 0.
Thus, by Eq. (5.103) (Prob. 5.23), we obtain

$$K_X(t, s) = \sigma_1^2 \min(t, s) = \lambda \min(t, s) \tag{5.162}$$

since $\sigma_1^2 = \operatorname{Var}[X(1)] = \lambda$. Next, since $E[X(t)]E[X(s)] = \lambda^2 ts$, by Eq. (5.10), we obtain

$$R_X(t, s) = \lambda \min(t, s) + \lambda^2 ts \tag{5.163}$$


5.53. Show that the time intervals between successive events (or interarrival times) in a Poisson
process X(t) with rate λ are independent and identically distributed exponential r.v.’s with
parameter λ.

Let $Z_1, Z_2, \ldots$ be the r.v.’s representing the lengths of interarrival times in the Poisson process X(t).
First, notice that {$Z_1 > t$} takes place if and only if no event of the Poisson process occurs in the interval
(0, t), and thus by Eq. (5.154),

$$P(Z_1 > t) = P\{X(t) = 0\} = e^{-\lambda t}$$

or

$$F_{Z_1}(t) = P(Z_1 \le t) = 1 - e^{-\lambda t}$$

Hence $Z_1$ is an exponential r.v. with parameter λ [Eq. (2.49)]. Let $f_1(t)$ be the pdf of $Z_1$. Then we have

$$\begin{aligned}
P(Z_2 > t) &= \int P(Z_2 > t \mid Z_1 = \tau) f_1(\tau)\, d\tau \\
&= \int P[X(t + \tau) - X(\tau) = 0] f_1(\tau)\, d\tau \\
&= e^{-\lambda t} \int f_1(\tau)\, d\tau = e^{-\lambda t}
\end{aligned} \tag{5.164}$$

which indicates that $Z_2$ is also an exponential r.v. with parameter λ and is independent of $Z_1$. Repeating the
same argument, we conclude that $Z_1, Z_2, \ldots$ are iid exponential r.v.’s with parameter λ.


5.54. Let $T_n$ denote the time of the nth event of a Poisson process X(t) with rate λ. Show that $T_n$ is a
gamma r.v. with parameters (n, λ).

Clearly,

$$T_n = Z_1 + Z_2 + \cdots + Z_n$$

where $Z_n$, n = 1, 2, ..., are the interarrival times defined by Eq. (5.149). From Prob. 5.53, we know that the $Z_n$
are iid exponential r.v.’s with parameter λ. Now, using the result of Prob. 4.33, we see that $T_n$ is a gamma
r.v. with parameters (n, λ), and its pdf is given by [Eq. (2.76)]:

$$f_{T_n}(t) = \begin{cases} \lambda e^{-\lambda t} \dfrac{(\lambda t)^{n-1}}{(n - 1)!} & t > 0 \\ 0 & t < 0 \end{cases} \tag{5.165}$$

The random process {$T_n$, n ≥ 1} is often called an arrival process.


5.55. Suppose t is not a point at which an event occurs in a Poisson process X(t) with rate λ. Let W(t)
be the r.v. representing the time until the next occurrence of an event. Show that the distribution
of W(t) is independent of t and W(t) is an exponential r.v. with parameter λ.

Let s (0 ≤ s < t) be the point at which the last event [say the (n − 1)st event] occurred (Fig. 5-14). The
event {W(t) > τ} is equivalent to the event

$$\{Z_n > t - s + \tau \mid Z_n > t - s\}$$

Thus, using Eq. (5.164), we have

$$\begin{aligned}
P[W(t) > \tau] &= P(Z_n > t - s + \tau \mid Z_n > t - s) \\
&= \frac{P(Z_n > t - s + \tau)}{P(Z_n > t - s)} = \frac{e^{-\lambda(t - s + \tau)}}{e^{-\lambda(t - s)}} = e^{-\lambda \tau}
\end{aligned}$$

and

$$P[W(t) \le \tau] = 1 - e^{-\lambda \tau} \tag{5.166}$$

which indicates that W(t) is an exponential r.v. with parameter λ and is independent of t. Note that W(t) is
often called a waiting time.


5.56. Patients arrive at the doctor’s office according to a Poisson process with rate λ = 1/10 per minute. The
doctor will not see a patient until at least three patients are in the waiting room.

(a) Find the expected waiting time until the first patient is admitted to see the doctor.
(b) What is the probability that nobody is admitted to see the doctor in the first hour?

(a) Let $T_n$ denote the arrival time of the nth patient at the doctor’s office. Then

$$T_n = Z_1 + Z_2 + \cdots + Z_n$$

where $Z_n$, n = 1, 2, ..., are iid exponential r.v.’s with parameter λ = 1/10. By Eqs. (4.108) and (2.50),

$$E(T_n) = E\left(\sum_{i=1}^{n} Z_i\right) = \sum_{i=1}^{n} E(Z_i) = n \frac{1}{\lambda} \tag{5.167}$$

The expected waiting time until the first patient is admitted to see the doctor is

$$E(T_3) = 3(10) = 30 \text{ minutes}$$

(b) Let X(t) be the Poisson process with parameter λ = 1/10. The probability that nobody is admitted to see
the doctor in the first hour is the same as the probability that at most two patients arrive in the first 60
minutes. Thus, by Eq. (5.55),

$$\begin{aligned}
P[X(60) - X(0) \le 2] &= P[X(60) - X(0) = 0] + P[X(60) - X(0) = 1] + P[X(60) - X(0) = 2] \\
&= e^{-60/10} + e^{-60/10}\left(\tfrac{60}{10}\right) + e^{-60/10}\,\tfrac{1}{2}\left(\tfrac{60}{10}\right)^2 \\
&= e^{-6}(1 + 6 + 18) \approx 0.062
\end{aligned}$$
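Both numbers in this problem are quick to reproduce. The sketch below uses only the standard library; the helper name `poisson_pmf` and the variable names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """P[N = k] for a Poisson count with the given mean."""
    return exp(-mean) * mean**k / factorial(k)

rate = 1 / 10                     # patients per minute
expected_wait = 3 / rate          # mean arrival time of the third patient, n / rate

# Nobody is admitted in the first hour if at most two patients arrive in 60 minutes.
prob_none_admitted = sum(poisson_pmf(k, rate * 60) for k in range(3))

print(expected_wait, round(prob_none_admitted, 3))
```

The count over 60 minutes has mean 6, so the sum is e^(−6)(1 + 6 + 18).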


5.57. Let $T_n$ denote the time of the nth event of a Poisson process X(t) with rate λ. Suppose that one
event has occurred in the interval (0, t). Show that the conditional distribution of arrival time $T_1$
is uniform over (0, t).

For τ ≤ t,

$$\begin{aligned}
P[T_1 \le \tau \mid X(t) = 1] &= \frac{P[T_1 \le \tau, X(t) = 1]}{P[X(t) = 1]} \\
&= \frac{P[X(\tau) = 1, X(t) - X(\tau) = 0]}{P[X(t) = 1]} \\
&= \frac{P[X(\tau) = 1]\, P[X(t) - X(\tau) = 0]}{P[X(t) = 1]} \\
&= \frac{\lambda \tau e^{-\lambda \tau} e^{-\lambda(t - \tau)}}{\lambda t e^{-\lambda t}} = \frac{\tau}{t}
\end{aligned} \tag{5.168}$$

which indicates that $T_1$ is uniform over (0, t) [see Eq. (2.45)].
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5.58. 


Consider a Poisson process X(t) with rate A, and suppose that each time an event occurs, it is 
classified as either a type 1 or a type 2 event. Suppose further that the event is classified as a type 
1 event with probability p and a type 2 event with probability 1 — p. Let X(t) and X,(t) denote 
the number of type | and type 2 events, respectively, occurring in (0, 1). Show that {X ,(x), ¢ > 0} 
and {X,(t), ! = 0} are both Poisson processes with rates Ap and A(1 — p), respectively. Further- 
more, the two processes are independent. 


We have

$$X(t) = X_1(t) + X_2(t)$$

First we calculate the joint probability $P[X_1(t) = k, X_2(t) = m]$.

$$P[X_1(t) = k, X_2(t) = m] = \sum_{n=0}^{\infty} P[X_1(t) = k, X_2(t) = m \mid X(t) = n]\,P[X(t) = n]$$

Note that

$$P[X_1(t) = k, X_2(t) = m \mid X(t) = n] = 0 \qquad \text{when } n \ne k + m$$

Thus, using Eq. (5.158), we obtain

$$P[X_1(t) = k, X_2(t) = m] = P[X_1(t) = k, X_2(t) = m \mid X(t) = k + m]\,P[X(t) = k + m]$$
$$= P[X_1(t) = k, X_2(t) = m \mid X(t) = k + m]\; e^{-\lambda t}\,\frac{(\lambda t)^{k+m}}{(k+m)!}$$

Now, given that k + m events occurred, since each event has probability p of being a type 1 event and probability 1 − p of being a type 2 event, it follows that

$$P[X_1(t) = k, X_2(t) = m \mid X(t) = k + m] = \binom{k+m}{k} p^k (1-p)^m$$

Thus

$$P[X_1(t) = k, X_2(t) = m] = \binom{k+m}{k} p^k (1-p)^m e^{-\lambda t}\,\frac{(\lambda t)^{k+m}}{(k+m)!}$$
$$= \frac{(k+m)!}{k!\,m!}\, p^k (1-p)^m e^{-\lambda t}\,\frac{(\lambda t)^{k+m}}{(k+m)!}$$
$$= e^{-\lambda p t}\,\frac{(\lambda p t)^k}{k!}\; e^{-\lambda(1-p)t}\,\frac{[\lambda(1-p)t]^m}{m!}$$ (5.169)

Then

$$P[X_1(t) = k] = \sum_{m=0}^{\infty} P[X_1(t) = k, X_2(t) = m]$$
$$= e^{-\lambda p t}\,\frac{(\lambda p t)^k}{k!}\, e^{-\lambda(1-p)t} \sum_{m=0}^{\infty} \frac{[\lambda(1-p)t]^m}{m!}$$
$$= e^{-\lambda p t}\,\frac{(\lambda p t)^k}{k!}\, e^{-\lambda(1-p)t}\, e^{\lambda(1-p)t}$$
$$= e^{-\lambda p t}\,\frac{(\lambda p t)^k}{k!}$$ (5.170)

which indicates that $X_1(t)$ is a Poisson process with rate $\lambda p$. Similarly, we can obtain

$$P[X_2(t) = m] = \sum_{k=0}^{\infty} P[X_1(t) = k, X_2(t) = m] = e^{-\lambda(1-p)t}\,\frac{[\lambda(1-p)t]^m}{m!}$$ (5.171)

and so $X_2(t)$ is a Poisson process with rate $\lambda(1 - p)$. Finally, from Eqs. (5.170), (5.171), and (5.169), we see that

$$P[X_1(t) = k, X_2(t) = m] = P[X_1(t) = k]\,P[X_2(t) = m]$$

Hence, $X_1(t)$ and $X_2(t)$ are independent.
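
The factorization in Eq. (5.169) can also be checked numerically. The sketch below builds the joint pmf by conditioning on the total count, as in the solution, and compares it with the product of two Poisson pmfs; the parameter values and function names are illustrative assumptions, not part of the problem.

```python
import math

def poisson_pmf(k, mean):
    """P[N = k] for a Poisson r.v. N with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def joint_pmf(k, m, lam, p, t):
    """P[X1(t) = k, X2(t) = m], obtained by conditioning on X(t) = k + m."""
    binomial = math.comb(k + m, k) * p**k * (1 - p) ** m
    return binomial * poisson_pmf(k + m, lam * t)

lam, p, t = 2.0, 0.3, 1.5   # assumed example values

# Largest gap between the joint pmf and the product of the two marginals
max_gap = max(
    abs(joint_pmf(k, m, lam, p, t)
        - poisson_pmf(k, lam * p * t) * poisson_pmf(m, lam * (1 - p) * t))
    for k in range(10)
    for m in range(10)
)
print(max_gap)
```

A gap at the level of floating-point rounding is consistent with the independence of $X_1(t)$ and $X_2(t)$ shown above.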
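5.59. Let $X_1, \ldots, X_n$ be jointly normal r.v.'s. Show that the joint characteristic function of $X_1, \ldots, X_n$ is given by

$$\Psi_{X_1 \cdots X_n}(\omega_1, \ldots, \omega_n) = \exp\left( j \sum_{i=1}^{n} \omega_i \mu_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n} \omega_i \omega_k \sigma_{ik} \right)$$ (5.172)

where $\mu_i = E(X_i)$ and $\sigma_{ik} = \operatorname{Cov}(X_i, X_k)$.

Let
$$Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n$$

By definition (4.50), the characteristic function of Y is

$$\Psi_Y(\omega) = E\left[e^{j\omega(a_1 X_1 + \cdots + a_n X_n)}\right] = \Psi_{X_1 \cdots X_n}(\omega a_1, \ldots, \omega a_n)$$ (5.173)

Now, by the results of Prob. 4.55, we see that Y is a normal r.v. with mean and variance given by [Eqs. (4.108) and (4.111)]

$$\mu_Y = E(Y) = \sum_{i=1}^{n} a_i E(X_i) = \sum_{i=1}^{n} a_i \mu_i$$ (5.174)

$$\sigma_Y^2 = \operatorname{Var}(Y) = \sum_{i=1}^{n} \sum_{k=1}^{n} a_i a_k \operatorname{Cov}(X_i, X_k) = \sum_{i=1}^{n} \sum_{k=1}^{n} a_i a_k \sigma_{ik}$$ (5.175)

Thus, by Eq. (4.125),

$$\Psi_Y(\omega) = \exp\left[ j\omega\mu_Y - \tfrac{1}{2}\sigma_Y^2 \omega^2 \right] = \exp\left( j\omega \sum_{i=1}^{n} a_i \mu_i - \frac{1}{2}\omega^2 \sum_{i=1}^{n} \sum_{k=1}^{n} a_i a_k \sigma_{ik} \right)$$ (5.176)

Equating Eqs. (5.176) and (5.173) and setting $\omega = 1$, we get

$$\Psi_{X_1 \cdots X_n}(a_1, \ldots, a_n) = \exp\left( j \sum_{i=1}^{n} a_i \mu_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n} a_i a_k \sigma_{ik} \right)$$

By replacing $a_i$'s with $\omega_i$'s, we obtain Eq. (5.172); that is,

$$\Psi_{X_1 \cdots X_n}(\omega_1, \ldots, \omega_n) = \exp\left( j \sum_{i=1}^{n} \omega_i \mu_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n} \omega_i \omega_k \sigma_{ik} \right)$$

Let
$$\boldsymbol{\mu} = \begin{bmatrix} \mu_1 \\ \vdots \\ \mu_n \end{bmatrix} \qquad \boldsymbol{\omega} = \begin{bmatrix} \omega_1 \\ \vdots \\ \omega_n \end{bmatrix} \qquad K = [\sigma_{ik}] = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1n} \\ \vdots & \ddots & \vdots \\ \sigma_{n1} & \cdots & \sigma_{nn} \end{bmatrix}$$

Then we can write

$$\sum_{i=1}^{n} \omega_i \mu_i = \boldsymbol{\omega}^T \boldsymbol{\mu} \qquad \sum_{i=1}^{n} \sum_{k=1}^{n} \omega_i \omega_k \sigma_{ik} = \boldsymbol{\omega}^T K \boldsymbol{\omega}$$

and Eq. (5.172) can be expressed more compactly as

$$\Psi_{X_1 \cdots X_n}(\omega_1, \ldots, \omega_n) = \exp\left( j\boldsymbol{\omega}^T \boldsymbol{\mu} - \tfrac{1}{2}\boldsymbol{\omega}^T K \boldsymbol{\omega} \right)$$ (5.177)


5.60. Let $X_1, \ldots, X_n$ be jointly normal r.v.'s. Let

$$Y_1 = a_{11} X_1 + \cdots + a_{1n} X_n$$
$$\vdots$$ (5.178)
$$Y_m = a_{m1} X_1 + \cdots + a_{mn} X_n$$

where $a_{ij}$ (i = 1, ..., m; j = 1, ..., n) are constants. Show that $Y_1, \ldots, Y_m$ are also jointly normal r.v.'s.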
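Let
$$\mathbf{X} = \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix} \qquad \mathbf{Y} = \begin{bmatrix} Y_1 \\ \vdots \\ Y_m \end{bmatrix} \qquad A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}$$

Then Eq. (5.178) can be expressed as

$$\mathbf{Y} = A\mathbf{X}$$ (5.179)

Let
$$\boldsymbol{\mu}_X = E(\mathbf{X}) = \begin{bmatrix} \mu_1 \\ \vdots \\ \mu_n \end{bmatrix} \qquad \boldsymbol{\omega} = \begin{bmatrix} \omega_1 \\ \vdots \\ \omega_m \end{bmatrix} \qquad K_X = [\sigma_{ik}] = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1n} \\ \vdots & \ddots & \vdots \\ \sigma_{n1} & \cdots & \sigma_{nn} \end{bmatrix}$$

Then the characteristic function for $\mathbf{Y}$ can be written as

$$\Psi_Y(\omega_1, \ldots, \omega_m) = E\left(e^{j\boldsymbol{\omega}^T \mathbf{Y}}\right) = E\left(e^{j\boldsymbol{\omega}^T A \mathbf{X}}\right) = E\left[e^{j(A^T \boldsymbol{\omega})^T \mathbf{X}}\right] = \Psi_X(A^T \boldsymbol{\omega})$$

Since $\mathbf{X}$ is a normal random vector, by Eq. (5.177) we can write

$$\Psi_X(A^T \boldsymbol{\omega}) = \exp\left[ j(A^T\boldsymbol{\omega})^T \boldsymbol{\mu}_X - \tfrac{1}{2}(A^T\boldsymbol{\omega})^T K_X (A^T\boldsymbol{\omega}) \right] = \exp\left[ j\boldsymbol{\omega}^T A\boldsymbol{\mu}_X - \tfrac{1}{2}\boldsymbol{\omega}^T A K_X A^T \boldsymbol{\omega} \right]$$

Thus

$$\Psi_Y(\omega_1, \ldots, \omega_m) = \exp\left( j\boldsymbol{\omega}^T \boldsymbol{\mu}_Y - \tfrac{1}{2}\boldsymbol{\omega}^T K_Y \boldsymbol{\omega} \right)$$ (5.180)

where

$$\boldsymbol{\mu}_Y = A\boldsymbol{\mu}_X \qquad K_Y = A K_X A^T$$ (5.181)

Comparing Eqs. (5.177) and (5.180), we see that Eq. (5.180) is the characteristic function of a normal random vector $\mathbf{Y}$. Hence, we conclude that $Y_1, \ldots, Y_m$ are also jointly normal r.v.'s.

Note that on the basis of the above result, we can say that a random process $\{X(t), t \in T\}$ is a normal process if every finite linear combination of the r.v.'s $X(t_i)$, $t_i \in T$, is normally distributed.


5.61. Show that a Wiener process X(t) is a normal process.

Consider an arbitrary linear combination

$$\sum_{i=1}^{n} a_i X(t_i) = a_1 X(t_1) + a_2 X(t_2) + \cdots + a_n X(t_n)$$ (5.182)

where $0 < t_1 < \cdots < t_n$ and $a_i$ are real constants. Now we write

$$\sum_{i=1}^{n} a_i X(t_i) = (a_1 + \cdots + a_n)[X(t_1) - X(0)] + (a_2 + \cdots + a_n)[X(t_2) - X(t_1)] + \cdots + (a_{n-1} + a_n)[X(t_{n-1}) - X(t_{n-2})] + a_n[X(t_n) - X(t_{n-1})]$$ (5.183)

Now from conditions 1 and 2 of Definition 5.7.1, the right-hand side of Eq. (5.183) is a linear combination of independent normal r.v.'s. Thus, based on the result of Prob. 5.60, the left-hand side of Eq. (5.183) is also a normal r.v.; that is, every finite linear combination of the r.v.'s $X(t_i)$ is a normal r.v. Thus we conclude that the Wiener process X(t) is a normal process.


5.62. A random process $\{X(t), t \in T\}$ is said to be continuous in probability if for every $\varepsilon > 0$ and $t \in T$,

$$\lim_{h \to 0} P\{|X(t + h) - X(t)| > \varepsilon\} = 0$$ (5.184)

Show that a Wiener process X(t) is continuous in probability.

From Chebyshev inequality (2.97), we have

$$P\{|X(t + h) - X(t)| > \varepsilon\} \le \frac{\operatorname{Var}[X(t + h) - X(t)]}{\varepsilon^2} \qquad \varepsilon > 0$$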
Since X(t) has stationary increments, we have
$$\operatorname{Var}[X(t + h) - X(t)] = \operatorname{Var}[X(h)] = \sigma^2 h$$
in view of Eq. (5.63). Hence,

$$\lim_{h \to 0} P\{|X(t + h) - X(t)| > \varepsilon\} \le \lim_{h \to 0} \frac{\sigma^2 h}{\varepsilon^2} = 0$$

Thus the Wiener process X(t) is continuous in probability.


Supplementary Problems


5.63. Consider a random process $X(n) = \{X_n, n \ge 1\}$, where
$$X_n = Z_1 + Z_2 + \cdots + Z_n$$
and $Z_n$ are iid r.v.'s with zero mean and variance $\sigma^2$. Is X(n) stationary?

Ans. No.


5.64. Consider a random process X(t) defined by

$$X(t) = Y \cos(\omega t + \Theta)$$

where Y and $\Theta$ are independent r.v.'s and are uniformly distributed over (−A, A) and (−π, π), respectively.

(a) Find the mean of X(t).
(b) Find the autocorrelation function $R_X(t, s)$ of X(t).

Ans. (a) $E[X(t)] = 0$; (b) $R_X(t, s) = \frac{1}{6}A^2 \cos \omega(t - s)$


5.65. Suppose that a random process X(t) is wide-sense stationary with autocorrelation

$$R_X(t, t + \tau) = e^{-|\tau|/2}$$

(a) Find the second moment of the r.v. X(5).
(b) Find the second moment of the r.v. X(5) − X(3).

Ans. (a) $E[X^2(5)] = 1$; (b) $E\{[X(5) - X(3)]^2\} = 2(1 - e^{-1})$


5.66. Consider a random process X(t) defined by

$$X(t) = U \cos t + (V + 1) \sin t \qquad -\infty < t < \infty$$

where U and V are independent r.v.'s for which
$$E(U) = E(V) = 0 \qquad E(U^2) = E(V^2) = 1$$
(a) Find the autocovariance function $K_X(t, s)$ of X(t).
(b) Is X(t) WSS?

Ans. (a) $K_X(t, s) = \cos(s - t)$; (b) No.


5.67. Consider the random processes

$$X(t) = A_0 \cos(\omega_0 t + \Theta) \qquad Y(t) = A_1 \cos(\omega_1 t + \Phi)$$

where $A_0$, $A_1$, $\omega_0$, and $\omega_1$ are constants, and r.v.'s $\Theta$ and $\Phi$ are independent and uniformly distributed over (−π, π).
(a) Find the cross-correlation function $R_{XY}(t, t + \tau)$ of X(t) and Y(t).
(b) Repeat (a) if $\Theta = \Phi$.

Ans. (a) $R_{XY}(t, t + \tau) = 0$
(b) $R_{XY}(t, t + \tau) = \dfrac{A_0 A_1}{2} \cos[(\omega_1 - \omega_0)t + \omega_1 \tau]$


5.68. Given a Markov chain $\{X_n, n \ge 0\}$, find the joint pmf
$$P(X_0 = i_0, X_1 = i_1, \ldots, X_n = i_n)$$
Hint: Use Eq. (5.32).

Ans. $p_{i_0}(0)\, p_{i_0 i_1}\, p_{i_1 i_2} \cdots p_{i_{n-1} i_n}$


5.69. Let $\{X_n, n \ge 0\}$ be a homogeneous Markov chain. Show that
$$P(X_{n+1} = k_1, \ldots, X_{n+m} = k_m \mid X_0 = i_0, \ldots, X_n = i) = P(X_1 = k_1, \ldots, X_m = k_m \mid X_0 = i)$$
Hint: Use the Markov property (5.27) and the homogeneity property.


5.70. Verify Eq. (5.37).

Hint: Write Eq. (5.39) in terms of components.


5.71. Find $P^n$ for the following transition probability matrices:

$$\text{(a)}\ P = \begin{bmatrix} 1 & 0 \\ 0.5 & 0.5 \end{bmatrix} \qquad \text{(b)}\ P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad \text{(c)}\ P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0.3 & 0.2 & 0.5 \end{bmatrix}$$

Ans. (c) $P^n = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0.6 & 0.4 & 0 \end{bmatrix} + (0.5)^n \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ -0.6 & -0.4 & 1 \end{bmatrix}$


5.72. A certain product is made by two companies, A and B, that control the entire market. Currently, A and B have 60 percent and 40 percent, respectively, of the total market. Each year, A loses $\frac{2}{3}$ of its market share to B, while B loses $\frac{1}{2}$ of its share to A. Find the relative proportion of the market that each holds after 2 years.

Ans. A has 43.3 percent and B has 56.7 percent.


5.73. Consider a Markov chain with states {0, 1, 2} and transition probability matrix

$$P = \begin{bmatrix} 0 & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ 1 & 0 & 0 \end{bmatrix}$$

Is state 0 periodic?
Hint: Draw the state transition diagram.

Ans. No.


5.74. Verify Eq. (5.51).
Hint: Let $N = [N_{jk}]$, where $N_{jk}$ is the number of times the state k (∈ B) is occupied until absorption takes place when X(n) starts in state j (∈ B). Then $T_j = \sum_{k \in B} N_{jk}$; calculate $E(N_{jk})$.


5.75. Consider a Markov chain with transition probability matrix

$$P = \begin{bmatrix} 0.6 & 0.2 & 0.2 \\ 0.4 & 0.5 & 0.1 \\ 0.6 & 0 & 0.4 \end{bmatrix}$$

Find the steady-state probabilities.

Ans. $\mathbf{p} = \begin{bmatrix} \frac{5}{9} & \frac{2}{9} & \frac{2}{9} \end{bmatrix}$


5.76. Let X(t) be a Poisson process with rate $\lambda$. Find $E[X^2(t)]$.

Ans. $\lambda t + \lambda^2 t^2$


5.77. Let X(t) be a Poisson process with rate $\lambda$. Find $E\{[X(t) - X(s)]^2\}$ for t > s.
Hint: Use the independent stationary increments condition and the result of Prob. 5.76.

Ans. $\lambda(t - s) + \lambda^2 (t - s)^2$


5.78. Let X(t) be a Poisson process with rate $\lambda$. Find

$$P[X(t - d) = k \mid X(t) = j] \qquad d > 0$$

Ans. $\dfrac{j!}{k!\,(j - k)!} \left(\dfrac{t - d}{t}\right)^{k} \left(\dfrac{d}{t}\right)^{j - k}$


5.79. Let $T_n$ denote the time of the nth event of a Poisson process with rate $\lambda$. Find the variance of $T_n$.

Ans. $n/\lambda^2$


5.80. Assume that customers arrive at a bank in accordance with a Poisson process with rate $\lambda = 6$ per hour, and suppose that each customer is a man with probability $\frac{2}{3}$ and a woman with probability $\frac{1}{3}$. Now suppose that 10 men arrived in the first 2 hours. How many women would you expect to have arrived in the first 2 hours?

Ans. 4


5.81. Let $X_1, \ldots, X_n$ be jointly normal r.v.'s. Let

$$Y_i = X_i + c_i \qquad i = 1, \ldots, n$$

where $c_i$ are constants. Show that $Y_1, \ldots, Y_n$ are also jointly normal r.v.'s.

Hint: See Prob. 5.60.


5.82. Derive Eq. (5.63).
Hint: Use condition (1) of a Wiener process and Eq. (5.102) of Prob. 5.22.


Chapter 6 


Analysis and Processing of Random Processes 


6.1 INTRODUCTION 


In this chapter, we introduce the methods for analysis and processing of random processes. First, 
we introduce the definitions of stochastic continuity, stochastic derivatives, and stochastic integrals of 
random processes. Next, the notion of power spectral density is introduced. This concept enables us 
to study wide-sense stationary processes in the frequency domain and define a white noise process. 
The response of linear systems to random processes is then studied. Finally, orthogonal and spectral 
representations of random processes are presented. 


6.2 CONTINUITY, DIFFERENTIATION, INTEGRATION 


In this section, we shall consider only the continuous-time random processes. 


A. Stochastic Continuity:
A random process X(t) is said to be continuous in mean square or mean square (m.s.) continuous if

$$\lim_{\varepsilon \to 0} E\{[X(t + \varepsilon) - X(t)]^2\} = 0$$ (6.1)

The random process X(t) is m.s. continuous if and only if its autocorrelation function is continuous (Prob. 6.1). If X(t) is WSS, then it is m.s. continuous if and only if its autocorrelation function $R_X(\tau)$ is continuous at $\tau = 0$. If X(t) is m.s. continuous, then its mean is continuous; that is,

$$\lim_{\varepsilon \to 0} \mu_X(t + \varepsilon) = \mu_X(t)$$ (6.2)

which can be written as

$$\lim_{\varepsilon \to 0} E[X(t + \varepsilon)] = E\left[\lim_{\varepsilon \to 0} X(t + \varepsilon)\right]$$ (6.3)

Hence, if X(t) is m.s. continuous, then we may interchange the ordering of the operations of expecta- 
tion and limiting. Note that m.s. continuity of X(t) does not imply that the sample functions of X(t) 
are continuous. For instance, the Poisson process is m.s. continuous (Prob. 6.46), but sample func- 
tions of the Poisson process have a countably infinite number of discontinuities (see Fig. 5-2). 


B. Stochastic Derivatives:


A random process X(t) is said to have a m.s. derivative X'(t) if

$$\operatorname*{l.i.m.}_{\varepsilon \to 0} \frac{X(t + \varepsilon) - X(t)}{\varepsilon} = X'(t)$$ (6.4)

where l.i.m. denotes limit in the mean (square); that is,

$$\lim_{\varepsilon \to 0} E\left\{\left[\frac{X(t + \varepsilon) - X(t)}{\varepsilon} - X'(t)\right]^2\right\} = 0$$ (6.5)

The m.s. derivative of X(t) exists if $\partial^2 R_X(t, s)/\partial t\,\partial s$ exists (Prob. 6.6). If X(t) has the m.s. derivative X'(t), then its mean and autocorrelation function are given by

$$E[X'(t)] = \frac{d}{dt} E[X(t)] = \mu_X'(t)$$ (6.6)

$$R_{X'}(t, s) = \frac{\partial^2 R_X(t, s)}{\partial t\,\partial s}$$ (6.7)

Equation (6.6) indicates that the operations of differentiation and expectation may be interchanged. If X(t) is a normal random process for which the m.s. derivative X'(t) exists, then X'(t) is also a normal random process (Prob. 6.10).


C. Stochastic Integrals:


A m.s. integral of a random process X(t) is defined by

$$Y(t) = \int_{t_0}^{t} X(\alpha)\,d\alpha = \operatorname*{l.i.m.}_{\Delta t_i \to 0} \sum_i X(t_i)\,\Delta t_i$$ (6.8)

where $t_0 < t_1 < \cdots < t$ and $\Delta t_i = t_{i+1} - t_i$.
The m.s. integral of X(t) exists if the following integral exists (Prob. 6.11):

$$\int_{t_0}^{t} \int_{t_0}^{t} R_X(\alpha, \beta)\,d\alpha\,d\beta$$ (6.9)

This implies that if X(t) is m.s. continuous, then its m.s. integral Y(t) exists (see Prob. 6.1). The mean and the autocorrelation function of Y(t) are given by

$$\mu_Y(t) = E\left[\int_{t_0}^{t} X(\alpha)\,d\alpha\right] = \int_{t_0}^{t} E[X(\alpha)]\,d\alpha = \int_{t_0}^{t} \mu_X(\alpha)\,d\alpha$$ (6.10)

$$R_Y(t, s) = E\left[\int_{t_0}^{t} X(\alpha)\,d\alpha \int_{t_0}^{s} X(\beta)\,d\beta\right] = \int_{t_0}^{t} \int_{t_0}^{s} E[X(\alpha)X(\beta)]\,d\beta\,d\alpha = \int_{t_0}^{t} \int_{t_0}^{s} R_X(\alpha, \beta)\,d\beta\,d\alpha$$ (6.11)

Equation (6.10) indicates that the operations of integration and expectation may be interchanged. If X(t) is a normal random process, then its integral Y(t) is also a normal random process. This follows from the fact that $\sum_i X(t_i)\,\Delta t_i$ is a linear combination of the jointly normal r.v.'s (see Prob. 5.60).
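
As a rough numerical illustration of Eq. (6.11), the sketch below approximates the double integral for a Wiener process, whose autocorrelation is $R_X(t, s) = \sigma^2 \min(t, s)$ [Eq. (5.64)], with $t_0 = 0$. The function names, the grid size, and the choice $\sigma^2 = 1$ are assumptions made for the example; the closed form $t^2 s/2 - t^3/6$ (for $t \le s$) follows from integrating $\min(\alpha, \beta)$ directly.

```python
def r_wiener(a, b, sigma2=1.0):
    """Autocorrelation of the Wiener process, Eq. (5.64)."""
    return sigma2 * min(a, b)

def r_integral(t, s, r_x, n=400):
    """Midpoint-rule approximation of the double integral in Eq. (6.11), t0 = 0."""
    da, db = t / n, s / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * da
        for j in range(n):
            b = (j + 0.5) * db
            total += r_x(a, b)
    return total * da * db

t, s = 1.0, 2.0
approx = r_integral(t, s, r_wiener)
exact = t**2 * s / 2 - t**3 / 6   # closed form for t <= s with sigma^2 = 1
print(approx, exact)
```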


6.3 POWER SPECTRAL DENSITIES 


In this section we assume that all random processes are WSS. 


A. Autocorrelation Functions:


The autocorrelation function of a continuous-time random process X(t) is defined as [Eq. (5.7)]

$$R_X(\tau) = E[X(t)X(t + \tau)]$$ (6.12)

Properties of $R_X(\tau)$:
1. $R_X(-\tau) = R_X(\tau)$ (6.13)
2. $|R_X(\tau)| \le R_X(0)$ (6.14)
3. $R_X(0) = E[X^2(t)] \ge 0$ (6.15)

Property 3 [Eq. (6.15)] is easily obtained by setting $\tau = 0$ in Eq. (6.12). If we assume that X(t) is a voltage waveform across a 1-Ω resistor, then $E[X^2(t)]$ is the average value of power delivered to the 1-Ω resistor by X(t). Thus, $E[X^2(t)]$ is often called the average power of X(t). Properties 1 and 2 are verified in Prob. 6.13.

In the case of a discrete-time random process X(n), the autocorrelation function of X(n) is defined by

$$R_X(k) = E[X(n)X(n + k)]$$ (6.16)

Various properties of $R_X(k)$ similar to those of $R_X(\tau)$ can be obtained by replacing $\tau$ by k in Eqs. (6.13) to (6.15).


B. Cross-Correlation Functions:


The cross-correlation function of two continuous-time jointly WSS random processes X(t) and Y(t) is defined by

$$R_{XY}(\tau) = E[X(t)Y(t + \tau)]$$ (6.17)

Properties of $R_{XY}(\tau)$:
1. $R_{XY}(-\tau) = R_{YX}(\tau)$ (6.18)
2. $|R_{XY}(\tau)| \le \sqrt{R_X(0)R_Y(0)}$ (6.19)
3. $|R_{XY}(\tau)| \le \frac{1}{2}[R_X(0) + R_Y(0)]$ (6.20)

These properties are verified in Prob. 6.14. Two processes X(t) and Y(t) are called (mutually) orthogonal if

$$R_{XY}(\tau) = 0 \qquad \text{for all } \tau$$ (6.21)

Similarly, the cross-correlation function of two discrete-time jointly WSS random processes X(n) and Y(n) is defined by

$$R_{XY}(k) = E[X(n)Y(n + k)]$$ (6.22)

and various properties of $R_{XY}(k)$ similar to those of $R_{XY}(\tau)$ can be obtained by replacing $\tau$ by k in Eqs. (6.18) to (6.20).


C. Power Spectral Density:


The power spectral density (or power spectrum) $S_X(\omega)$ of a continuous-time random process X(t) is defined as the Fourier transform of $R_X(\tau)$:

$$S_X(\omega) = \int_{-\infty}^{\infty} R_X(\tau) e^{-j\omega\tau}\,d\tau$$ (6.23)

Thus, taking the inverse Fourier transform of $S_X(\omega)$, we obtain

$$R_X(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_X(\omega) e^{j\omega\tau}\,d\omega$$ (6.24)

Equations (6.23) and (6.24) are known as the Wiener-Khinchin relations.

Properties of $S_X(\omega)$:
1. $S_X(\omega)$ is real and $S_X(\omega) \ge 0$. (6.25)
2. $S_X(-\omega) = S_X(\omega)$ (6.26)
3. $E[X^2(t)] = R_X(0) = \dfrac{1}{2\pi} \int_{-\infty}^{\infty} S_X(\omega)\,d\omega$ (6.27)

Similarly, the power spectral density $S_X(\Omega)$ of a discrete-time random process X(n) is defined as the Fourier transform of $R_X(k)$:

$$S_X(\Omega) = \sum_{k=-\infty}^{\infty} R_X(k) e^{-j\Omega k}$$ (6.28)

Thus, taking the inverse Fourier transform of $S_X(\Omega)$, we obtain

$$R_X(k) = \frac{1}{2\pi} \int_{-\pi}^{\pi} S_X(\Omega) e^{j\Omega k}\,d\Omega$$ (6.29)

Properties of $S_X(\Omega)$:
1. $S_X(\Omega + 2\pi) = S_X(\Omega)$ (6.30)
2. $S_X(\Omega)$ is real and $S_X(\Omega) \ge 0$. (6.31)
3. $S_X(-\Omega) = S_X(\Omega)$ (6.32)
4. $E[X^2(n)] = R_X(0) = \dfrac{1}{2\pi} \int_{-\pi}^{\pi} S_X(\Omega)\,d\Omega$ (6.33)

Note that property 1 [Eq. (6.30)] follows from the fact that $e^{-j\Omega k}$ is periodic with period $2\pi$. Hence it is sufficient to define $S_X(\Omega)$ only in the range (−π, π).
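
The discrete-time relations can be tried out on a concrete case. The sketch below assumes the autocorrelation $R_X(k) = a^{|k|}$ with $a = 0.6$ (an example chosen here, not taken from the text), evaluates a truncated version of the sum in Eq. (6.28), and compares it with the closed form $(1 - a^2)/(1 - 2a\cos\Omega + a^2)$ of the corresponding geometric series. It also approximates the integral in Eq. (6.33).

```python
import cmath
import math

a = 0.6  # assumed example: R_X(k) = a**|k|

def r_x(k):
    return a ** abs(k)

def psd(omega, k_max=200):
    """Truncated version of the sum in Eq. (6.28)."""
    return sum(r_x(k) * cmath.exp(-1j * omega * k)
               for k in range(-k_max, k_max + 1))

def psd_closed_form(omega):
    """Closed form of the geometric series for R_X(k) = a**|k|."""
    return (1 - a * a) / (1 - 2 * a * math.cos(omega) + a * a)

omega = 0.7
s = psd(omega)

# Eq. (6.33): average power as (1 / 2 pi) times the integral over (-pi, pi)
n = 2000
d_omega = 2 * math.pi / n
avg_power = sum(psd_closed_form(-math.pi + (i + 0.5) * d_omega)
                for i in range(n)) * d_omega / (2 * math.pi)

print(s, psd_closed_form(omega), avg_power)
```

The imaginary part of `s` should be at rounding level, and `avg_power` should be close to $R_X(0) = 1$, in line with properties 2 to 4.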
D. Cross Power Spectral Densities:


The cross power spectral density (or cross power spectrum) $S_{XY}(\omega)$ of two continuous-time random processes X(t) and Y(t) is defined as the Fourier transform of $R_{XY}(\tau)$:

$$S_{XY}(\omega) = \int_{-\infty}^{\infty} R_{XY}(\tau) e^{-j\omega\tau}\,d\tau$$ (6.34)

Thus, taking the inverse Fourier transform of $S_{XY}(\omega)$, we get

$$R_{XY}(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_{XY}(\omega) e^{j\omega\tau}\,d\omega$$ (6.35)

Properties of $S_{XY}(\omega)$:

Unlike $S_X(\omega)$, which is a real-valued function of $\omega$, $S_{XY}(\omega)$, in general, is a complex-valued function.

1. $S_{XY}(\omega) = S_{YX}(-\omega)$ (6.36)
2. $S_{XY}(-\omega) = S_{XY}^*(\omega)$ (6.37)

Similarly, the cross power spectral density $S_{XY}(\Omega)$ of two discrete-time random processes X(n) and Y(n) is defined as the Fourier transform of $R_{XY}(k)$:

$$S_{XY}(\Omega) = \sum_{k=-\infty}^{\infty} R_{XY}(k) e^{-j\Omega k}$$ (6.38)

Thus, taking the inverse Fourier transform of $S_{XY}(\Omega)$, we get

$$R_{XY}(k) = \frac{1}{2\pi} \int_{-\pi}^{\pi} S_{XY}(\Omega) e^{j\Omega k}\,d\Omega$$ (6.39)

Properties of $S_{XY}(\Omega)$:

Unlike $S_X(\Omega)$, which is a real-valued function of $\Omega$, $S_{XY}(\Omega)$, in general, is a complex-valued function.

1. $S_{XY}(\Omega + 2\pi) = S_{XY}(\Omega)$ (6.40)
2. $S_{XY}(\Omega) = S_{YX}(-\Omega)$ (6.41)
3. $S_{XY}(-\Omega) = S_{XY}^*(\Omega)$ (6.42)


6.4 WHITE NOISE


A continuous-time white noise process, W(t), is a WSS zero-mean continuous-time random process whose autocorrelation function is given by

$$R_W(\tau) = \sigma^2 \delta(\tau)$$ (6.43)

where $\delta(\tau)$ is a unit impulse function (or Dirac δ function) defined by

$$\int_{-\infty}^{\infty} \delta(\tau)\phi(\tau)\,d\tau = \phi(0)$$ (6.44)

where $\phi(\tau)$ is any function continuous at $\tau = 0$. Taking the Fourier transform of Eq. (6.43), we obtain

$$S_W(\omega) = \sigma^2 \int_{-\infty}^{\infty} \delta(\tau) e^{-j\omega\tau}\,d\tau = \sigma^2$$ (6.45)

which indicates that W(t) has a constant power spectral density (hence the name white noise). Note that the average power of W(t) is not finite.

Similarly, a WSS zero-mean discrete-time random process W(n) is called a discrete-time white noise if its autocorrelation function is given by

$$R_W(k) = \sigma^2 \delta(k)$$ (6.46)

where $\delta(k)$ is a unit impulse sequence (or unit sample sequence) defined by

$$\delta(k) = \begin{cases} 1 & k = 0 \\ 0 & k \ne 0 \end{cases}$$ (6.47)

Taking the Fourier transform of Eq. (6.46), we obtain

$$S_W(\Omega) = \sigma^2 \sum_{k=-\infty}^{\infty} \delta(k) e^{-j\Omega k} = \sigma^2 \qquad -\pi < \Omega < \pi$$ (6.48)

Again the power spectral density of W(n) is a constant. Note that $S_W(\Omega + 2\pi) = S_W(\Omega)$ and the average power of W(n) is $\sigma^2 = \operatorname{Var}[W(n)]$, which is finite.


6.5 RESPONSE OF LINEAR SYSTEMS TO RANDOM INPUTS
A. Linear Systems:


A system is a mathematical model of a physical process that relates the input (or excitation) signal x to the output (or response) signal y. Then the system is viewed as a transformation (or mapping) of x into y. This transformation is represented by the operator T as (Fig. 6-1)

$$y = Tx$$ (6.49)

[Fig. 6-1: block diagram of a system T with input x and output y]

If x and y are continuous-time signals, then the system is called a continuous-time system, and if x and y are discrete-time signals, then the system is called a discrete-time system. If the operator T is a linear operator satisfying

$$T\{x_1 + x_2\} = Tx_1 + Tx_2 = y_1 + y_2 \qquad \text{(Additivity)}$$
$$T\{\alpha x\} = \alpha Tx = \alpha y \qquad \text{(Homogeneity)}$$

where $\alpha$ is a scalar number, then the system represented by T is called a linear system. A system is called time-invariant if a time shift in the input signal causes the same time shift in the output signal. Thus, for a continuous-time system,

$$T\{x(t - t_0)\} = y(t - t_0)$$

for any value of $t_0$, and for a discrete-time system,

$$T\{x(n - n_0)\} = y(n - n_0)$$

for any integer $n_0$. For a continuous-time linear time-invariant (LTI) system, Eq. (6.49) can be expressed as

$$y(t) = \int_{-\infty}^{\infty} h(\lambda) x(t - \lambda)\,d\lambda$$ (6.50)

where

$$h(t) = T\{\delta(t)\}$$ (6.51)

is known as the impulse response of a continuous-time LTI system. The right-hand side of Eq. (6.50) is commonly called the convolution integral of h(t) and x(t), denoted by h(t) * x(t). For a discrete-time LTI system, Eq. (6.49) can be expressed as

$$y(n) = \sum_{i=-\infty}^{\infty} h(i) x(n - i)$$ (6.52)

where

$$h(n) = T\{\delta(n)\}$$ (6.53)

is known as the impulse response (or unit sample response) of a discrete-time LTI system. The right-hand side of Eq. (6.52) is commonly called the convolution sum of h(n) and x(n), denoted by h(n) * x(n).
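
The convolution sum of Eq. (6.52) is easy to experiment with for finite sequences. The following sketch assumes causal, finite-length sequences that start at n = 0; the impulse response and input values are made up for illustration. It checks additivity, homogeneity, time invariance, and that a unit sample input returns h(n) as in Eq. (6.53).

```python
def convolve(h, x):
    """Convolution sum of Eq. (6.52) for finite sequences starting at n = 0."""
    y = [0.0] * (len(h) + len(x) - 1)
    for i, h_i in enumerate(h):
        for n, x_n in enumerate(x):
            y[i + n] += h_i * x_n
    return y

h = [0.5, 0.3, 0.2]            # assumed impulse response
x1 = [1.0, 2.0, 0.0, -1.0]     # assumed inputs
x2 = [0.0, 1.0, 1.0, 3.0]

y1, y2 = convolve(h, x1), convolve(h, x2)
y_sum = convolve(h, [u + v for u, v in zip(x1, x2)])   # additivity
y_scaled = convolve(h, [4.0 * u for u in x1])          # homogeneity
y_shift = convolve(h, [0.0] + x1)                      # input delayed by one sample
impulse_response = convolve(h, [1.0])                  # response to a unit sample

print(y1)
print(y_shift)
```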


B. Response of a Continuous-Time Linear System to Random Input:


When the input to a continuous-time linear system represented by Eq. (6.49) is a random process $\{X(t), t \in T_x\}$, then the output will also be a random process $\{Y(t), t \in T_y\}$; that is,

$$T\{X(t), t \in T_x\} = \{Y(t), t \in T_y\}$$ (6.54)

For any input sample function $x_i(t)$, the corresponding output sample function is

$$y_i(t) = T\{x_i(t)\}$$ (6.55)

If the system is LTI, then by Eq. (6.50), we can write

$$Y(t) = \int_{-\infty}^{\infty} h(\lambda) X(t - \lambda)\,d\lambda$$ (6.56)

Note that Eq. (6.56) is a stochastic integral. Then

$$E[Y(t)] = \int_{-\infty}^{\infty} h(\lambda) E[X(t - \lambda)]\,d\lambda$$ (6.57)

The autocorrelation function of Y(t) is given by (Prob. 6.24)

$$R_Y(t, s) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\alpha) h(\beta) R_X(t - \alpha, s - \beta)\,d\alpha\,d\beta$$ (6.58)

If the input X(t) is WSS, then from Eq. (6.57),

$$E[Y(t)] = \mu_X \int_{-\infty}^{\infty} h(\lambda)\,d\lambda = \mu_X H(0)$$ (6.59)

where $H(0) = H(\omega)|_{\omega = 0}$ and $H(\omega)$ is the frequency response of the system defined by the Fourier transform of h(t); that is,

$$H(\omega) = \int_{-\infty}^{\infty} h(t) e^{-j\omega t}\,dt$$ (6.60)

The autocorrelation function of Y(t) is, from Eq. (6.58),

$$R_Y(t, s) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\alpha) h(\beta) R_X(s - t + \alpha - \beta)\,d\alpha\,d\beta$$ (6.61)

Setting $s = t + \tau$, we get

$$R_Y(t, t + \tau) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\alpha) h(\beta) R_X(\tau + \alpha - \beta)\,d\alpha\,d\beta = R_Y(\tau)$$ (6.62)

From Eqs. (6.59) and (6.62), we see that the output Y(t) is also WSS. Taking the Fourier transform of Eq. (6.62), the power spectral density of Y(t) is given by (Prob. 6.25)

$$S_Y(\omega) = \int_{-\infty}^{\infty} R_Y(\tau) e^{-j\omega\tau}\,d\tau = |H(\omega)|^2 S_X(\omega)$$ (6.63)

Thus, we obtain the important result that the power spectral density of the output is the product of the power spectral density of the input and the magnitude squared of the frequency response of the system.

When the autocorrelation function of the output $R_Y(\tau)$ is desired, it is easier to determine the power spectral density $S_Y(\omega)$ and then evaluate the inverse Fourier transform (Prob. 6.26). Thus,

$$R_Y(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_Y(\omega) e^{j\omega\tau}\,d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} |H(\omega)|^2 S_X(\omega) e^{j\omega\tau}\,d\omega$$ (6.64)

By Eq. (6.15), the average power in the output Y(t) is

$$E[Y^2(t)] = R_Y(0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} |H(\omega)|^2 S_X(\omega)\,d\omega$$ (6.65)
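
To see Eqs. (6.63) and (6.65) at work, consider an example that is assumed here rather than taken from the text: white noise with $S_X(\omega) = \sigma^2$ applied to a first-order low-pass system with frequency response $H(\omega) = 1/(1 + j\omega RC)$. Since $\int_{-\infty}^{\infty} d\omega/(1 + \omega^2 R^2 C^2) = \pi/(RC)$, the output power should be $\sigma^2/(2RC)$. The sketch approximates the integral on a truncated frequency range, so a small truncation error remains.

```python
import math

sigma2 = 2.0   # assumed white-noise level, S_X(w) = sigma2
rc = 0.5       # assumed time constant in H(w) = 1 / (1 + j w RC)

def s_y(w):
    """Output power spectral density from Eq. (6.63)."""
    h = 1 / (1 + 1j * w * rc)
    return abs(h) ** 2 * sigma2

# Eq. (6.65) by the midpoint rule on the truncated range [-w_max, w_max]
w_max, n = 4000.0, 400_000
dw = 2 * w_max / n
power = sum(s_y(-w_max + (i + 0.5) * dw) for i in range(n)) * dw / (2 * math.pi)
closed_form = sigma2 / (2 * rc)

print(power, closed_form)
```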


C. Response of a Discrete-Time Linear System to Random Input:


When the input to a discrete-time LTI system is a discrete-time random process X(n), then by Eq. (6.52), the output Y(n) is

$$Y(n) = \sum_{i=-\infty}^{\infty} h(i) X(n - i)$$ (6.66)

The autocorrelation function of Y(n) is given by

$$R_Y(n, m) = \sum_{i=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} h(i) h(l) R_X(n - i, m - l)$$ (6.67)

When X(n) is WSS, then from Eq. (6.66),

$$E[Y(n)] = \mu_X \sum_{i=-\infty}^{\infty} h(i) = \mu_X H(0)$$ (6.68)

where $H(0) = H(\Omega)|_{\Omega = 0}$ and $H(\Omega)$ is the frequency response of the system defined by the Fourier transform of h(n):

$$H(\Omega) = \sum_{n=-\infty}^{\infty} h(n) e^{-j\Omega n}$$ (6.69)

The autocorrelation function of Y(n) is, from Eq. (6.67),

$$R_Y(n, m) = \sum_{i=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} h(i) h(l) R_X(m - n + i - l)$$ (6.70)

Setting m = n + k, we get

$$R_Y(n, n + k) = \sum_{i=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} h(i) h(l) R_X(k + i - l) = R_Y(k)$$ (6.71)

From Eqs. (6.68) and (6.71), we see that the output Y(n) is also WSS. Taking the Fourier transform of Eq. (6.71), the power spectral density of Y(n) is given by (Prob. 6.28)

$$S_Y(\Omega) = |H(\Omega)|^2 S_X(\Omega)$$ (6.72)

which is the same as Eq. (6.63).
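
A small numerical check of Eqs. (6.71) and (6.72) is sketched below for discrete-time white noise [Eq. (6.46)] passed through a three-tap FIR system. The tap values, the noise level, and the test frequency are assumptions made for the example.

```python
import cmath

sigma2 = 1.5
h = [1.0, 0.5, 0.25]   # assumed impulse response h(0), h(1), h(2)

def r_x(k):
    """White-noise autocorrelation, Eq. (6.46)."""
    return sigma2 if k == 0 else 0.0

def r_y(k):
    """Output autocorrelation from the double sum in Eq. (6.71)."""
    return sum(h[i] * h[l] * r_x(k + i - l)
               for i in range(len(h)) for l in range(len(h)))

def freq_response(omega):
    """Frequency response, Eq. (6.69)."""
    return sum(h_n * cmath.exp(-1j * omega * n) for n, h_n in enumerate(h))

omega = 1.1
lags = range(-(len(h) - 1), len(h))   # R_Y(k) vanishes outside these lags

# Left side: Fourier transform of R_Y(k). Right side: Eq. (6.72).
s_y_from_r = sum(r_y(k) * cmath.exp(-1j * omega * k) for k in lags)
s_y_from_h = abs(freq_response(omega)) ** 2 * sigma2

print(s_y_from_r, s_y_from_h)
```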


6.6 FOURIER SERIES AND KARHUNEN-LOÉVE EXPANSIONS
A. Stochastic Periodicity:
A continuous-time random process X(t) is said to be m.s. periodic with period T if

$$E\{[X(t + T) - X(t)]^2\} = 0$$ (6.73)

If X(t) is WSS, then X(t) is m.s. periodic if and only if its autocorrelation function is periodic with period T; that is,

$$R_X(\tau + T) = R_X(\tau)$$ (6.74)


B. Fourier Series:


Let X(t) be a WSS random process with periodic $R_X(\tau)$ having period T. Expanding $R_X(\tau)$ into a Fourier series, we obtain

$$R_X(\tau) = \sum_{n=-\infty}^{\infty} c_n e^{jn\omega_0 \tau} \qquad \omega_0 = 2\pi/T$$ (6.75)

where

$$c_n = \frac{1}{T} \int_0^T R_X(\tau) e^{-jn\omega_0 \tau}\,d\tau$$ (6.76)

Let $\hat{X}(t)$ be expressed as

$$\hat{X}(t) = \sum_{n=-\infty}^{\infty} X_n e^{jn\omega_0 t} \qquad \omega_0 = 2\pi/T$$ (6.77)

where $X_n$ are r.v.'s given by

$$X_n = \frac{1}{T} \int_0^T X(t) e^{-jn\omega_0 t}\,dt$$ (6.78)

Note that, in general, $X_n$ are complex-valued r.v.'s. For complex-valued r.v.'s, the correlation between two r.v.'s X and Y is defined by $E(XY^*)$. Then $\hat{X}(t)$ is called the m.s. Fourier series of X(t) such that (Prob. 6.34)

$$E\{|X(t) - \hat{X}(t)|^2\} = 0$$ (6.79)

Furthermore, we have (Prob. 6.33)

$$E(X_n) = \mu_X \delta(n) = \begin{cases} \mu_X & n = 0 \\ 0 & n \ne 0 \end{cases}$$ (6.80)

$$E(X_n X_m^*) = c_n \delta(n - m) = \begin{cases} c_n & n = m \\ 0 & n \ne m \end{cases}$$ (6.81)

C. Karhunen-Loéve Expansion:
Consider a random process X(t) which is not periodic. Let $\hat{X}(t)$ be expressed as

$$\hat{X}(t) = \sum_{n=1}^{\infty} X_n \phi_n(t) \qquad 0 < t < T$$ (6.82)

where a set of functions $\{\phi_n(t)\}$ is orthonormal on an interval (0, T) such that

$$\int_0^T \phi_n(t) \phi_m^*(t)\,dt = \delta(n - m)$$ (6.83)

and $X_n$ are r.v.'s given by

$$X_n = \int_0^T X(t) \phi_n^*(t)\,dt$$ (6.84)

Then $\hat{X}(t)$ is called the Karhunen-Loéve expansion of X(t) such that (Prob. 6.38)

$$E\{|X(t) - \hat{X}(t)|^2\} = 0$$ (6.85)

Let $R_X(t, s)$ be the autocorrelation function of X(t), and consider the following integral equation:

$$\int_0^T R_X(t, s) \phi_n(s)\,ds = \lambda_n \phi_n(t) \qquad 0 < t < T$$ (6.86)

where $\lambda_n$ and $\phi_n(t)$ are called the eigenvalues and the corresponding eigenfunctions of the integral equation (6.86). It is known from the theory of integral equations that if $R_X(t, s)$ is continuous, then $\phi_n(t)$ of Eq. (6.86) are orthonormal as in Eq. (6.83), and they satisfy the following identity:

$$R_X(t, s) = \sum_{n=1}^{\infty} \lambda_n \phi_n(t) \phi_n^*(s)$$ (6.87)

which is known as Mercer's theorem.
With the above results, we can show that Eq. (6.85) is satisfied and the coefficients $X_n$ are orthogonal r.v.'s (Prob. 6.37); that is,

$$E(X_n X_m^*) = \lambda_n \delta(n - m) = \begin{cases} \lambda_n & n = m \\ 0 & n \ne m \end{cases}$$ (6.88)
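
Mercer's theorem, Eq. (6.87), can be illustrated for the Wiener process on (0, T), with $R_X(t, s) = \sigma^2 \min(t, s)$. The eigenpairs used below, $\lambda_n = \sigma^2 T^2/[(n - \frac{1}{2})^2 \pi^2]$ and $\phi_n(t) = \sqrt{2/T}\,\sin[(n - \frac{1}{2})\pi t/T]$, are the standard solution of Eq. (6.86) for this kernel; they are quoted here as an assumption rather than derived in this section, and the values of $\sigma^2$, T, t, and s are arbitrary.

```python
import math

sigma2, T = 1.0, 1.0   # assumed parameters

def eigenvalue(n):
    return sigma2 * T**2 / ((n - 0.5) ** 2 * math.pi**2)

def eigenfunction(n, t):
    return math.sqrt(2 / T) * math.sin((n - 0.5) * math.pi * t / T)

def mercer_sum(t, s, terms=5000):
    """Partial sum of the series in Eq. (6.87)."""
    return sum(eigenvalue(n) * eigenfunction(n, t) * eigenfunction(n, s)
               for n in range(1, terms + 1))

def inner_product(n, m, steps=2000):
    """Midpoint-rule approximation of the integral in Eq. (6.83)."""
    dt = T / steps
    return sum(eigenfunction(n, (i + 0.5) * dt) * eigenfunction(m, (i + 0.5) * dt)
               for i in range(steps)) * dt

t, s = 0.3, 0.7
approx = mercer_sum(t, s)
print(approx, sigma2 * min(t, s))
print(inner_product(1, 1), inner_product(1, 2))
```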
6.7 FOURIER TRANSFORM OF RANDOM PROCESSES
A. Continuous-Time Random Processes:
The Fourier transform of a continuous-time random process X(t) is a random process $\tilde{X}(\omega)$ given by

$$\tilde{X}(\omega) = \int_{-\infty}^{\infty} X(t) e^{-j\omega t}\,dt$$ (6.89)

which is the stochastic integral, and the integral is interpreted as an m.s. limit; that is,

$$E\left\{\left|\tilde{X}(\omega) - \int_{-\infty}^{\infty} X(t) e^{-j\omega t}\,dt\right|^2\right\} = 0$$ (6.90)

Note that $\tilde{X}(\omega)$ is a complex random process. Similarly, the inverse Fourier transform

$$X(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{X}(\omega) e^{j\omega t}\,d\omega$$ (6.91)

is also a stochastic integral and should also be interpreted in the m.s. sense. The properties of continuous-time Fourier transforms (Appendix B) also hold for random processes (or random signals). For instance, if Y(t) is the output of a continuous-time LTI system with input X(t), then

$$\tilde{Y}(\omega) = \tilde{X}(\omega) H(\omega)$$ (6.92)

where $H(\omega)$ is the frequency response of the system.
Let $\tilde{R}_X(\omega_1, \omega_2)$ be the two-dimensional Fourier transform of $R_X(t, s)$; that is,

$$\tilde{R}_X(\omega_1, \omega_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} R_X(t, s) e^{-j(\omega_1 t + \omega_2 s)}\,dt\,ds$$ (6.93)

Then the autocorrelation function of $\tilde{X}(\omega)$ is given by (Prob. 6.41)

$$R_{\tilde{X}}(\omega_1, \omega_2) = E[\tilde{X}(\omega_1)\tilde{X}^*(\omega_2)] = \tilde{R}_X(\omega_1, -\omega_2)$$ (6.94)

If X(t) is real, then

$$E[\tilde{X}(\omega_1)\tilde{X}(\omega_2)] = \tilde{R}_X(\omega_1, \omega_2)$$ (6.95)
$$\tilde{X}(-\omega) = \tilde{X}^*(\omega)$$ (6.96)
$$\tilde{R}_X(-\omega_1, -\omega_2) = \tilde{R}_X^*(\omega_1, \omega_2)$$ (6.97)

If X(t) is a WSS random process with autocorrelation function $R_X(t, s) = R_X(t - s) = R_X(\tau)$ and power spectral density $S_X(\omega)$, then (Prob. 6.42)

$$\tilde{R}_X(\omega_1, \omega_2) = 2\pi S_X(\omega_1)\delta(\omega_1 + \omega_2)$$ (6.98)
$$R_{\tilde{X}}(\omega_1, \omega_2) = 2\pi S_X(\omega_1)\delta(\omega_1 - \omega_2)$$ (6.99)

Equation (6.99) shows that the Fourier transform of a WSS random process is nonstationary white noise.


B. Discrete-Time Random Processes:
The Fourier transform of a discrete-time random process X(n) is a random process $\tilde{X}(\Omega)$ given by (in m.s. sense)

$$\tilde{X}(\Omega) = \sum_{n=-\infty}^{\infty} X(n) e^{-j\Omega n}$$ (6.100)

Similarly, the inverse Fourier transform

$$X(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \tilde{X}(\Omega) e^{j\Omega n}\,d\Omega$$ (6.101)

should also be interpreted in the m.s. sense. Note that $\tilde{X}(\Omega + 2\pi) = \tilde{X}(\Omega)$ and the properties of discrete-time Fourier transforms (Appendix B) also hold for discrete-time random signals. For instance, if Y(n) is the output of a discrete-time LTI system with input X(n), then

$$\tilde{Y}(\Omega) = \tilde{X}(\Omega) H(\Omega)$$ (6.102)

where $H(\Omega)$ is the frequency response of the system.
Let $\tilde{R}_X(\Omega_1, \Omega_2)$ be the two-dimensional Fourier transform of $R_X(n, m)$:

$$\tilde{R}_X(\Omega_1, \Omega_2) = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} R_X(n, m) e^{-j(\Omega_1 n + \Omega_2 m)}$$ (6.103)

Then the autocorrelation function of $\tilde{X}(\Omega)$ is given by (Prob. 6.44)

$$R_{\tilde{X}}(\Omega_1, \Omega_2) = E[\tilde{X}(\Omega_1)\tilde{X}^*(\Omega_2)] = \tilde{R}_X(\Omega_1, -\Omega_2)$$ (6.104)

If X(n) is a WSS random process with autocorrelation function $R_X(n, m) = R_X(n - m) = R_X(k)$ and power spectral density $S_X(\Omega)$, then

$$\tilde{R}_X(\Omega_1, \Omega_2) = 2\pi S_X(\Omega_1)\delta(\Omega_1 + \Omega_2)$$ (6.105)
$$R_{\tilde{X}}(\Omega_1, \Omega_2) = 2\pi S_X(\Omega_1)\delta(\Omega_1 - \Omega_2)$$ (6.106)

Equation (6.106) shows that the Fourier transform of a discrete-time WSS random process is nonstationary white noise.


Solved Problems 


CONTINUITY, DIFFERENTIATION, INTEGRATION 
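6.1. Show that the random process X(t) is m.s. continuous if and only if its autocorrelation function $R_X(t, s)$ is continuous.

We can write

$$E\{[X(t + \varepsilon) - X(t)]^2\} = E[X^2(t + \varepsilon) - 2X(t + \varepsilon)X(t) + X^2(t)]$$
$$= R_X(t + \varepsilon, t + \varepsilon) - 2R_X(t + \varepsilon, t) + R_X(t, t)$$ (6.107)

Thus, if $R_X(t, s)$ is continuous, then

$$\lim_{\varepsilon \to 0} E\{[X(t + \varepsilon) - X(t)]^2\} = \lim_{\varepsilon \to 0} \{R_X(t + \varepsilon, t + \varepsilon) - 2R_X(t + \varepsilon, t) + R_X(t, t)\} = 0$$

and X(t) is m.s. continuous. Next, consider

$$R_X(t + \varepsilon_1, t + \varepsilon_2) - R_X(t, t) = E\{[X(t + \varepsilon_1) - X(t)][X(t + \varepsilon_2) - X(t)]\} + E\{[X(t + \varepsilon_1) - X(t)]X(t)\} + E\{[X(t + \varepsilon_2) - X(t)]X(t)\}$$

Applying Cauchy-Schwarz inequality (3.97) (Prob. 3.35), we obtain

$$R_X(t + \varepsilon_1, t + \varepsilon_2) - R_X(t, t) \le \left(E\{[X(t + \varepsilon_1) - X(t)]^2\}\,E\{[X(t + \varepsilon_2) - X(t)]^2\}\right)^{1/2} + \left(E\{[X(t + \varepsilon_1) - X(t)]^2\}\,E[X^2(t)]\right)^{1/2} + \left(E\{[X(t + \varepsilon_2) - X(t)]^2\}\,E[X^2(t)]\right)^{1/2}$$

Thus if X(t) is m.s. continuous, then by Eq. (6.1) we have

$$\lim_{\varepsilon_1, \varepsilon_2 \to 0} R_X(t + \varepsilon_1, t + \varepsilon_2) - R_X(t, t) = 0$$

that is, $R_X(t, s)$ is continuous. This completes the proof.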
6.2. Show that a WSS random process X(t) is m.s. continuous if and only if its autocorrelation function $R_X(\tau)$ is continuous at $\tau = 0$.

If X(t) is WSS, then Eq. (6.107) becomes

$$E\{[X(t + \varepsilon) - X(t)]^2\} = 2[R_X(0) - R_X(\varepsilon)]$$ (6.108)

Thus if $R_X(\tau)$ is continuous at $\tau = 0$, that is,

$$\lim_{\varepsilon \to 0} [R_X(\varepsilon) - R_X(0)] = 0$$

then

$$\lim_{\varepsilon \to 0} E\{[X(t + \varepsilon) - X(t)]^2\} = 0$$

that is, X(t) is m.s. continuous. Similarly, we can show that if X(t) is m.s. continuous, then by Eq. (6.108), $R_X(\tau)$ is continuous at $\tau = 0$.


6.3. Show that if X(t) is m.s. continuous, then its mean is continuous; that is,

$$\lim_{\varepsilon \to 0} \mu_X(t + \varepsilon) = \mu_X(t)$$

We have

$$\operatorname{Var}[X(t + \varepsilon) - X(t)] = E\{[X(t + \varepsilon) - X(t)]^2\} - \{E[X(t + \varepsilon) - X(t)]\}^2 \ge 0$$

Thus

$$E\{[X(t + \varepsilon) - X(t)]^2\} \ge \{E[X(t + \varepsilon) - X(t)]\}^2 = [\mu_X(t + \varepsilon) - \mu_X(t)]^2$$

If X(t) is m.s. continuous, then as $\varepsilon \to 0$, the left-hand side of the above expression approaches zero. Thus

$$\lim_{\varepsilon \to 0} [\mu_X(t + \varepsilon) - \mu_X(t)] = 0 \qquad \text{or} \qquad \lim_{\varepsilon \to 0} \mu_X(t + \varepsilon) = \mu_X(t)$$


6.4. Show that the Wiener process X(t) is m.s. continuous.

From Eq. (5.64), the autocorrelation function of the Wiener process X(t) is given by

$$R_X(t, s) = \sigma^2 \min(t, s)$$

Thus, we have

$$|R_X(t + \varepsilon_1, t + \varepsilon_2) - R_X(t, t)| = \sigma^2 |\min(t + \varepsilon_1, t + \varepsilon_2) - t| \le \sigma^2 \max(\varepsilon_1, \varepsilon_2)$$

Since

$$\lim_{\varepsilon_1, \varepsilon_2 \to 0} \max(\varepsilon_1, \varepsilon_2) = 0$$

$R_X(t, s)$ is continuous. Hence the Wiener process X(t) is m.s. continuous.


6.5. Show that every m.s. continuous random process is continuous in probability.

A random process X(t) is continuous in probability if, for every t and a > 0 (see Prob. 5.62),

$$\lim_{\varepsilon \to 0} P\{|X(t + \varepsilon) - X(t)| > a\} = 0$$

Applying Chebyshev inequality (2.97) (Prob. 2.37), we have

$$P\{|X(t + \varepsilon) - X(t)| > a\} \le \frac{E[|X(t + \varepsilon) - X(t)|^2]}{a^2}$$

Now, if X(t) is m.s. continuous, then the right-hand side goes to 0 as $\varepsilon \to 0$, which implies that the left-hand side must also go to 0 as $\varepsilon \to 0$. Thus, we have proved that if X(t) is m.s. continuous, then it is also continuous in probability.


6.6. Show that a random process X(t) has a m.s. derivative X'(t) if $\partial^2 R_X(t, s)/\partial t\,\partial s$ exists at s = t.
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6.7. 


Let Y(t; 6) = Xt 8) = X +6) = X(t) 


By the Cauchy criterion (see the note at the end of this solution), the m.s. derivative X'(t) exists if 


lim E{{Y¥(t: €) — ¥(t; €,)]?} =0 


£).6270 
Now E{(Y(t; €,) — Y(t; e,)}?} = ELY (ts €2) — 2¥(t; e,)¥ (ts €,) + ¥%(ts €,)) 
= ELY*(t; €2)] — 2EL Y(t; €,) V(t; €,)) + EL Y(t: €,)] 


and ELY(s 2) Y(t e,]] = (LX + 9) — KONA +e) — X00} 


| 
=— [Rylt + €,, 0+ 8,) — Rylt + €,, 0 — Rylt, t+ €,) + Rylt, OH] 


Ee 
1 fRylt + ey, t+ e))— Rylt + 62,0) Rylt, t+ 6) — Relt, 9) 
Bp ey fy 
G7 Ry(t, s 
Thus lim ELY(e; 6) ¥(t; €:)] = CAMS =R, 
£1.6270 CUCS  |yat 


provided é?R,(t, s)/@t ds exists at s = ¢. Setting e, = e, in Eq. (6.112), we get 


lim EL Y(t; €,)] = lim ELY%t; €,)] = Rp 


6,70 £270 
and by Eq. (6.//1), we obtain 
lim E{L Y(t; 6.) — Y(t; €,)]?} = R, -2R, +R, =0 


£3,€270 
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(6.109) 


(6.110) 


(6.111) 


(6.112) 


(6.113) 


Thus, we conclude that X(t) has a m.s. derivative X‘(t) if d?R,(t, s/t Os exists at s = ¢. If X(t) is WSS, then 


the above conclusion is equivalent to the existence of 0?R,(1)/??1 at t = 0. 


Note: In real analysis, a function g(e) of some parameter ¢ converges to a finite value if 


lim [g(e,) — g(e,)] = 9 


&), 8270 


This is known as the Cauchy criterion. 


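6.7. Suppose a random process X(t) has a m.s. derivative X'(t).

(a) Find E[X'(t)].
(b) Find the cross-correlation function of X(t) and X'(t).
(c) Find the autocorrelation function of X'(t).

(a) We have
$$\begin{aligned} E[X'(t)] &= E\left[ \operatorname*{l.i.m.}_{\varepsilon \to 0} \frac{X(t+\varepsilon) - X(t)}{\varepsilon} \right] = \lim_{\varepsilon \to 0} E\left[ \frac{X(t+\varepsilon) - X(t)}{\varepsilon} \right] \\ &= \lim_{\varepsilon \to 0} \frac{\mu_X(t+\varepsilon) - \mu_X(t)}{\varepsilon} = \mu_X'(t) \end{aligned} \tag{6.114}$$
(b) From Eq. (6.17), the cross-correlation function of X(t) and X'(t) is
$$\begin{aligned} R_{XX'}(t, s) = E[X(t)X'(s)] &= E\left[ X(t) \operatorname*{l.i.m.}_{\varepsilon \to 0} \frac{X(s+\varepsilon) - X(s)}{\varepsilon} \right] \\ &= \lim_{\varepsilon \to 0} \frac{E[X(t)X(s+\varepsilon)] - E[X(t)X(s)]}{\varepsilon} \\ &= \lim_{\varepsilon \to 0} \frac{R_X(t, s+\varepsilon) - R_X(t, s)}{\varepsilon} = \frac{\partial R_X(t, s)}{\partial s} \end{aligned} \tag{6.115}$$
(c) Using Eq. (6.115), the autocorrelation function of X'(t) is
$$\begin{aligned} R_{X'}(t, s) = E[X'(t)X'(s)] &= E\left\{ \left[ \operatorname*{l.i.m.}_{\varepsilon \to 0} \frac{X(t+\varepsilon) - X(t)}{\varepsilon} \right] X'(s) \right\} \\ &= \lim_{\varepsilon \to 0} \frac{E[X(t+\varepsilon)X'(s)] - E[X(t)X'(s)]}{\varepsilon} \\ &= \lim_{\varepsilon \to 0} \frac{R_{XX'}(t+\varepsilon, s) - R_{XX'}(t, s)}{\varepsilon} \\ &= \frac{\partial R_{XX'}(t, s)}{\partial t} = \frac{\partial^2 R_X(t, s)}{\partial t\,\partial s} \end{aligned} \tag{6.116}$$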
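6.8. If X(t) is a WSS random process and has a m.s. derivative X'(t), then show that
$$\text{(a)} \quad R_{XX'}(\tau) = \frac{d}{d\tau} R_X(\tau) \tag{6.117}$$
$$\text{(b)} \quad R_{X'}(\tau) = -\frac{d^2}{d\tau^2} R_X(\tau) \tag{6.118}$$

(a) For a WSS process X(t), $R_X(t, s) = R_X(s - t)$. Thus, setting $s - t = \tau$ in Eq. (6.115) of Prob. 6.7, we obtain $\partial R_X(s - t)/\partial s = dR_X(\tau)/d\tau$ and
$$R_{XX'}(t, t+\tau) = R_{XX'}(\tau) = \frac{dR_X(\tau)}{d\tau}$$
(b) Now $\partial R_X(s - t)/\partial t = -dR_X(\tau)/d\tau$. Thus, $\partial^2 R_X(s - t)/\partial t\,\partial s = -d^2 R_X(\tau)/d\tau^2$, and by Eq. (6.116) of Prob. 6.7, we have
$$R_{X'}(t, t+\tau) = R_{X'}(\tau) = -\frac{d^2}{d\tau^2} R_X(\tau)$$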
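
As a worked illustration of Eqs. (6.117) and (6.118), assume a WSS process with the Gaussian-shaped autocorrelation $R_X(\tau) = e^{-\tau^2}$ (a hypothetical choice made only for this example; it does not come from the text). Then
$$R_{XX'}(\tau) = \frac{d}{d\tau} e^{-\tau^2} = -2\tau e^{-\tau^2}$$
$$R_{X'}(\tau) = -\frac{d^2}{d\tau^2} e^{-\tau^2} = (2 - 4\tau^2) e^{-\tau^2}$$
In this example $R_{XX'}(0) = 0$, so X(t) and X'(t) are orthogonal at the same instant, and $R_{X'}(0) = 2 > 0$ is the mean-square value of the derivative.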


6.9. Show that the Wiener process X(t) does not have a m.s. derivative.

From Eq. (5.64), the autocorrelation function of the Wiener process X(t) is given by
$$R_X(t, s) = \sigma^2 \min(t, s) = \begin{cases} \sigma^2 s & t > s \\ \sigma^2 t & t < s \end{cases}$$
Thus
$$\frac{\partial}{\partial s} R_X(t, s) = \sigma^2 u(t - s) = \begin{cases} \sigma^2 & t > s \\ 0 & t < s \end{cases} \tag{6.119}$$
where u(t - s) is a unit step function defined by
$$u(t - s) = \begin{cases} 1 & t > s \\ 0 & t < s \end{cases}$$
and it is not continuous at s = t (Fig. 6-2). Thus $\partial^2 R_X(t, s)/\partial t\,\partial s$ does not exist at s = t, and the Wiener process X(t) does not have a m.s. derivative.

Fig. 6-2 Shifted unit step function.

Note that although a m.s. derivative does not exist for the Wiener process, we can define a generalized derivative of the Wiener process (see Prob. 6.20).
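
The failure of the m.s. derivative can also be seen numerically. The sketch below expands the second moment of the difference quotient $[X(t+\varepsilon) - X(t)]/\varepsilon$ in terms of $R_X(t, s) = \sigma^2 \min(t, s)$; the function names and the sample values of t and $\sigma^2$ are illustrative assumptions, not part of the text.

```python
def wiener_autocorr(t: float, s: float, sigma2: float = 1.0) -> float:
    """R_X(t, s) = sigma^2 * min(t, s), Eq. (5.64)."""
    return sigma2 * min(t, s)


def difference_quotient_power(t: float, eps: float, sigma2: float = 1.0) -> float:
    """E[Y^2] for Y = [X(t + eps) - X(t)] / eps, expanded in terms of R_X."""
    r = wiener_autocorr
    second_moment = (r(t + eps, t + eps, sigma2)
                     - 2.0 * r(t + eps, t, sigma2)
                     + r(t, t, sigma2))
    return second_moment / eps ** 2


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        # Grows like sigma^2 / eps instead of settling to a finite limit.
        print(eps, difference_quotient_power(1.0, eps, sigma2=2.0))
```

The second moment behaves like $\sigma^2/\varepsilon$, so it has no finite limit as $\varepsilon \to 0$, in agreement with the conclusion above.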


6.10. Show that if X(t) is a normal random process for which the m.s. derivative X'(t) exists, then X'(t) is also a normal random process.

Let X(t) be a normal random process. Now consider
$$Y_\varepsilon(t) = \frac{X(t+\varepsilon) - X(t)}{\varepsilon}$$
Then, n r.v.'s $Y_\varepsilon(t_1), Y_\varepsilon(t_2), \ldots, Y_\varepsilon(t_n)$ are given by a linear transformation of the jointly normal r.v.'s $X(t_1), X(t_1+\varepsilon), X(t_2), X(t_2+\varepsilon), \ldots, X(t_n), X(t_n+\varepsilon)$. It then follows by the result of Prob. 5.60 that $Y_\varepsilon(t_1), Y_\varepsilon(t_2), \ldots, Y_\varepsilon(t_n)$ are jointly normal r.v.'s, and hence $Y_\varepsilon(t)$ is a normal random process. Thus, we conclude that the m.s. derivative X'(t), which is the limit of $Y_\varepsilon(t)$ as $\varepsilon \to 0$, is also a normal random process, since m.s. convergence implies convergence in probability (see Prob. 6.5).


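6.11. Show that the m.s. integral of a random process X(t) exists if the following integral exists:
$$\int_{t_0}^{t} \int_{t_0}^{t} R_X(\alpha, \beta)\, d\alpha\, d\beta$$

A m.s. integral of X(t) is defined by [Eq. (6.8)]
$$Y(t) = \int_{t_0}^{t} X(\alpha)\, d\alpha = \operatorname*{l.i.m.}_{\Delta t_i \to 0} \sum_i X(t_i)\, \Delta t_i$$
Again using the Cauchy criterion, the m.s. integral Y(t) of X(t) exists if
$$\lim_{\Delta t_i, \Delta t_k \to 0} E\left\{ \left[ \sum_i X(t_i)\, \Delta t_i - \sum_k X(t_k)\, \Delta t_k \right]^2 \right\} = 0 \tag{6.120}$$
As in the case of the m.s. derivative [Eq. (6.111)], expanding the square, we obtain
$$\begin{aligned} E\left\{ \left[ \sum_i X(t_i)\, \Delta t_i - \sum_k X(t_k)\, \Delta t_k \right]^2 \right\} &= E\left[ \sum_i \sum_k X(t_i)X(t_k)\, \Delta t_i\, \Delta t_k + \sum_i \sum_k X(t_i)X(t_k)\, \Delta t_i\, \Delta t_k - 2 \sum_i \sum_k X(t_i)X(t_k)\, \Delta t_i\, \Delta t_k \right] \\ &= \sum_i \sum_k R_X(t_i, t_k)\, \Delta t_i\, \Delta t_k + \sum_i \sum_k R_X(t_i, t_k)\, \Delta t_i\, \Delta t_k - 2 \sum_i \sum_k R_X(t_i, t_k)\, \Delta t_i\, \Delta t_k \end{aligned}$$
and Eq. (6.120) holds if
$$\lim_{\Delta t_i, \Delta t_k \to 0} \sum_i \sum_k R_X(t_i, t_k)\, \Delta t_i\, \Delta t_k$$
exists, or, equivalently,
$$\int_{t_0}^{t} \int_{t_0}^{t} R_X(\alpha, \beta)\, d\alpha\, d\beta$$
exists.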


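6.12. Let X(t) be the Wiener process with parameter $\sigma^2$. Let
$$Y(t) = \int_0^t X(\alpha)\, d\alpha$$
(a) Find the mean and the variance of Y(t).
(b) Find the autocorrelation function of Y(t).

(a) By assumption 3 of the Wiener process (Sec. 5.7), that is, E[X(t)] = 0, we have
$$E[Y(t)] = E\left[ \int_0^t X(\alpha)\, d\alpha \right] = \int_0^t E[X(\alpha)]\, d\alpha = 0 \tag{6.121}$$
Then
$$\operatorname{Var}[Y(t)] = E[Y^2(t)] = \int_0^t \int_0^t E[X(\alpha)X(\beta)]\, d\alpha\, d\beta = \int_0^t \int_0^t R_X(\alpha, \beta)\, d\alpha\, d\beta$$
By Eq. (5.64), $R_X(\alpha, \beta) = \sigma^2 \min(\alpha, \beta)$; thus, referring to Fig. 6-3, we obtain
$$\begin{aligned} \operatorname{Var}[Y(t)] &= \sigma^2 \int_0^t \int_0^t \min(\alpha, \beta)\, d\alpha\, d\beta \\ &= \sigma^2 \int_0^t d\beta \int_0^\beta \alpha\, d\alpha + \sigma^2 \int_0^t d\alpha \int_0^\alpha \beta\, d\beta = \frac{\sigma^2 t^3}{3} \end{aligned} \tag{6.122}$$
(b) Let $t > s \ge 0$ and write
$$\begin{aligned} Y(t) &= \int_0^s X(\alpha)\, d\alpha + \int_s^t [X(\alpha) - X(s)]\, d\alpha + (t - s)X(s) \\ &= Y(s) + \int_s^t [X(\alpha) - X(s)]\, d\alpha + (t - s)X(s) \end{aligned}$$
Then, for $t > s \ge 0$,
$$\begin{aligned} R_Y(t, s) &= E[Y(t)Y(s)] \\ &= E[Y^2(s)] + \int_s^t E\{[X(\alpha) - X(s)]Y(s)\}\, d\alpha + (t - s)E[X(s)Y(s)] \end{aligned} \tag{6.123}$$
Now by Eq. (6.122),
$$E[Y^2(s)] = \operatorname{Var}[Y(s)] = \frac{\sigma^2 s^3}{3}$$
Using assumptions 1, 3, and 4 of the Wiener process (Sec. 5.7), and since $s \le \alpha \le t$, we have
$$\begin{aligned} \int_s^t E\{[X(\alpha) - X(s)]Y(s)\}\, d\alpha &= \int_s^t E\left\{ [X(\alpha) - X(s)] \int_0^s X(\beta)\, d\beta \right\} d\alpha \\ &= \int_s^t \int_0^s E\{[X(\alpha) - X(s)][X(\beta) - X(0)]\}\, d\beta\, d\alpha \\ &= \int_s^t \int_0^s E[X(\alpha) - X(s)]\, E[X(\beta) - X(0)]\, d\beta\, d\alpha = 0 \end{aligned}$$
Finally, for $0 \le \beta \le s$,
$$\begin{aligned} (t - s)E[X(s)Y(s)] &= (t - s) \int_0^s E[X(s)X(\beta)]\, d\beta \\ &= (t - s) \int_0^s R_X(s, \beta)\, d\beta = (t - s) \int_0^s \sigma^2 \min(s, \beta)\, d\beta \\ &= \sigma^2 (t - s) \int_0^s \beta\, d\beta = \sigma^2 (t - s) \frac{s^2}{2} \end{aligned}$$
Substituting these results into Eq. (6.123), we get
$$R_Y(t, s) = \frac{\sigma^2 s^3}{3} + \sigma^2 (t - s) \frac{s^2}{2} = \frac{1}{6} \sigma^2 s^2 (3t - s)$$
Since $R_Y(t, s) = R_Y(s, t)$, we obtain
$$R_Y(t, s) = \begin{cases} \frac{1}{6} \sigma^2 s^2 (3t - s) & t > s \ge 0 \\ \frac{1}{6} \sigma^2 t^2 (3s - t) & s > t \ge 0 \end{cases} \tag{6.124}$$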
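
The closed form can be cross-checked against the defining double integral $R_Y(t, s) = \int_0^t \int_0^s R_X(\alpha, \beta)\, d\beta\, d\alpha$ [Eq. (6.11)] with $R_X(\alpha, \beta) = \sigma^2 \min(\alpha, \beta)$. The following is a minimal sketch using a midpoint Riemann sum; the function names, the grid size, and the sample arguments are assumptions chosen for illustration.

```python
def ry_closed_form(t: float, s: float, sigma2: float = 1.0) -> float:
    """Closed form derived above: sigma^2 * lo^2 * (3*hi - lo) / 6."""
    lo, hi = min(t, s), max(t, s)
    return sigma2 * lo ** 2 * (3.0 * hi - lo) / 6.0


def ry_riemann(t: float, s: float, sigma2: float = 1.0, n: int = 200) -> float:
    """Midpoint-rule estimate of int_0^t int_0^s sigma^2 * min(a, b) db da."""
    da, db = t / n, s / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * da
        for k in range(n):
            b = (k + 0.5) * db
            total += min(a, b)
    return sigma2 * total * da * db


if __name__ == "__main__":
    print(ry_closed_form(2.0, 1.0), ry_riemann(2.0, 1.0))
```

Setting t = s in the closed form recovers the variance $\sigma^2 t^3/3$ of Eq. (6.122).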


POWER SPECTRAL DENSITY 
6.13. Verify Eqs. (6.13) and (6.14).

From Eq. (6.12),
$$R_X(\tau) = E[X(t)X(t+\tau)]$$
Setting $t + \tau = s$, we get
$$R_X(\tau) = E[X(s - \tau)X(s)] = E[X(s)X(s - \tau)] = R_X(-\tau)$$
Next, we have
$$E\{[X(t) \pm X(t+\tau)]^2\} \ge 0$$
Expanding the square, we have
$$E[X^2(t) \pm 2X(t)X(t+\tau) + X^2(t+\tau)] \ge 0$$
or
$$E[X^2(t)] \pm 2E[X(t)X(t+\tau)] + E[X^2(t+\tau)] \ge 0$$
Thus
$$2R_X(0) \pm 2R_X(\tau) \ge 0$$
from which we obtain Eq. (6.14); that is,
$$R_X(0) \ge |R_X(\tau)|$$


6.14. Verify Eqs. (6.18) to (6.20).

By Eq. (6.17),
$$R_{XY}(-\tau) = E[X(t)Y(t - \tau)]$$
Setting $t - \tau = s$, we get
$$R_{XY}(-\tau) = E[X(s+\tau)Y(s)] = E[Y(s)X(s+\tau)] = R_{YX}(\tau)$$
Next, from the Cauchy-Schwarz inequality, Eq. (3.97) (Prob. 3.35), it follows that
$$\{E[X(t)Y(t+\tau)]\}^2 \le E[X^2(t)]\, E[Y^2(t+\tau)]$$
or
$$[R_{XY}(\tau)]^2 \le R_X(0) R_Y(0)$$
from which we obtain Eq. (6.19); that is,
$$|R_{XY}(\tau)| \le \sqrt{R_X(0) R_Y(0)}$$
Now
$$E\{[X(t) - Y(t+\tau)]^2\} \ge 0$$
Expanding the square, we have
$$E[X^2(t) - 2X(t)Y(t+\tau) + Y^2(t+\tau)] \ge 0$$
or
$$E[X^2(t)] - 2E[X(t)Y(t+\tau)] + E[Y^2(t+\tau)] \ge 0$$
Thus
$$R_X(0) - 2R_{XY}(\tau) + R_Y(0) \ge 0$$
from which we obtain Eq. (6.20); that is,
$$R_{XY}(\tau) \le \tfrac{1}{2} [R_X(0) + R_Y(0)]$$


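6.15. Two random processes X(t) and Y(t) are given by
$$X(t) = A \cos(\omega t + \Theta) \qquad Y(t) = A \sin(\omega t + \Theta)$$
where A and $\omega$ are constants and $\Theta$ is a uniform r.v. over $(0, 2\pi)$. Find the cross-correlation function of X(t) and Y(t) and verify Eq. (6.18).

From Eq. (6.17), the cross-correlation function of X(t) and Y(t) is
$$\begin{aligned} R_{XY}(t, t+\tau) &= E[X(t)Y(t+\tau)] \\ &= E\{A^2 \cos(\omega t + \Theta) \sin[\omega(t+\tau) + \Theta]\} \\ &= \frac{A^2}{2} E[\sin(2\omega t + \omega\tau + 2\Theta) - \sin(-\omega\tau)] \\ &= \frac{A^2}{2} \sin \omega\tau = R_{XY}(\tau) \end{aligned} \tag{6.125}$$
Similarly,
$$\begin{aligned} R_{YX}(t, t+\tau) &= E[Y(t)X(t+\tau)] \\ &= E\{A^2 \sin(\omega t + \Theta) \cos[\omega(t+\tau) + \Theta]\} \\ &= \frac{A^2}{2} E[\sin(2\omega t + \omega\tau + 2\Theta) + \sin(-\omega\tau)] \\ &= -\frac{A^2}{2} \sin \omega\tau = R_{YX}(\tau) \end{aligned} \tag{6.126}$$
From Eqs. (6.125) and (6.126), we see that
$$R_{XY}(-\tau) = \frac{A^2}{2} \sin \omega(-\tau) = -\frac{A^2}{2} \sin \omega\tau = R_{YX}(\tau)$$
which verifies Eq. (6.18).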
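
The expectation over the random phase can be checked by averaging over an evenly spaced grid of phase values in $(0, 2\pi)$, which is exact (up to rounding) for trigonometric terms of this kind. This is a minimal sketch; the function name and the default values of A and $\omega$ are assumptions for illustration only.

```python
import math


def cross_corr(t: float, tau: float, A: float = 2.0, w: float = 3.0,
               n: int = 2000) -> float:
    """Average of X(t) * Y(t + tau) over a uniform phase grid on (0, 2*pi)."""
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * (k + 0.5) / n
        x = A * math.cos(w * t + theta)
        y = A * math.sin(w * (t + tau) + theta)
        total += x * y
    return total / n


if __name__ == "__main__":
    # The result should not depend on t, only on tau.
    print(cross_corr(0.0, 0.4), cross_corr(5.0, 0.4))
```

With A = 2 and $\omega$ = 3 the average should match $(A^2/2) \sin \omega\tau = 2 \sin 3\tau$ for any t.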


6.16. Show that the power spectrum of a (real) random process X(t) is real and verify Eq. (6.26).

From Eq. (6.23) and expanding the exponential, we have
$$\begin{aligned} S_X(\omega) &= \int_{-\infty}^{\infty} R_X(\tau) e^{-j\omega\tau}\, d\tau \\ &= \int_{-\infty}^{\infty} R_X(\tau)(\cos \omega\tau - j \sin \omega\tau)\, d\tau \\ &= \int_{-\infty}^{\infty} R_X(\tau) \cos \omega\tau\, d\tau - j \int_{-\infty}^{\infty} R_X(\tau) \sin \omega\tau\, d\tau \end{aligned} \tag{6.127}$$
Since $R_X(-\tau) = R_X(\tau)$, $R_X(\tau) \cos \omega\tau$ is an even function of $\tau$ and $R_X(\tau) \sin \omega\tau$ is an odd function of $\tau$, and hence the imaginary term in Eq. (6.127) vanishes and we obtain
$$S_X(\omega) = \int_{-\infty}^{\infty} R_X(\tau) \cos \omega\tau\, d\tau \tag{6.128}$$
which indicates that $S_X(\omega)$ is real. Since $\cos(-\omega\tau) = \cos(\omega\tau)$, it follows that
$$S_X(-\omega) = S_X(\omega)$$
which indicates that the power spectrum of a real random process X(t) is an even function of frequency.


6.17. Consider the random process
$$Y(t) = (-1)^{X(t)}$$
where X(t) is a Poisson process with rate $\lambda$. Thus Y(t) starts at Y(0) = 1 and switches back and forth from +1 to -1 at random Poisson times $T_i$, as shown in Fig. 6-4. The process Y(t) is known as the semirandom telegraph signal because its initial value Y(0) = 1 is not random.

(a) Find the mean of Y(t).
(b) Find the autocorrelation function of Y(t).

(a) We have
$$Y(t) = \begin{cases} 1 & \text{if } X(t) \text{ is even} \\ -1 & \text{if } X(t) \text{ is odd} \end{cases}$$
Thus, using Eq. (5.55), we have
$$P[Y(t) = 1] = P[X(t) = \text{even integer}] = e^{-\lambda t} \left[ 1 + \frac{(\lambda t)^2}{2!} + \cdots \right] = e^{-\lambda t} \cosh \lambda t$$
$$P[Y(t) = -1] = P[X(t) = \text{odd integer}] = e^{-\lambda t} \left[ \lambda t + \frac{(\lambda t)^3}{3!} + \cdots \right] = e^{-\lambda t} \sinh \lambda t$$

Fig. 6-4 Semirandom telegraph signal.

Hence
$$\mu_Y(t) = E[Y(t)] = (1)P[Y(t) = 1] + (-1)P[Y(t) = -1] = e^{-\lambda t}(\cosh \lambda t - \sinh \lambda t) = e^{-2\lambda t} \tag{6.129}$$
(b) Similarly, since $Y(t)Y(t+\tau) = 1$ if there are an even number of events in $(t, t+\tau)$ for $\tau > 0$ and $Y(t)Y(t+\tau) = -1$ if there are an odd number of events, then for $t > 0$ and $t + \tau > 0$,
$$\begin{aligned} R_Y(t, t+\tau) &= E[Y(t)Y(t+\tau)] \\ &= (1) \sum_{n\ \text{even}} e^{-\lambda\tau} \frac{(\lambda\tau)^n}{n!} + (-1) \sum_{n\ \text{odd}} e^{-\lambda\tau} \frac{(\lambda\tau)^n}{n!} \\ &= e^{-\lambda\tau} \sum_{n=0}^{\infty} \frac{(-\lambda\tau)^n}{n!} = e^{-\lambda\tau} e^{-\lambda\tau} = e^{-2\lambda\tau} \end{aligned}$$
which indicates that $R_Y(t, t+\tau) = R_Y(\tau)$, and by Eq. (6.13),
$$R_Y(\tau) = e^{-2\lambda|\tau|} \tag{6.130}$$
Note that since E[Y(t)] is not a constant, Y(t) is not WSS.
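
The mean in Eq. (6.129) can be verified by summing $(-1)^n$ directly against the Poisson probabilities, without using the cosh and sinh identities. The sketch below does this with a truncated series; the function name and the truncation length are assumptions for illustration.

```python
import math


def telegraph_mean(lam: float, t: float, terms: int = 80) -> float:
    """E[(-1)^X(t)] summed term by term over the Poisson pmf of X(t)."""
    total = 0.0
    p = math.exp(-lam * t)      # P[X(t) = 0]
    for n in range(terms):
        total += (-1) ** n * p
        p *= lam * t / (n + 1)  # P[X(t) = n + 1] from P[X(t) = n]
    return total


if __name__ == "__main__":
    lam, t = 1.5, 0.8
    print(telegraph_mean(lam, t), math.exp(-2.0 * lam * t))
```

The truncation at 80 terms is adequate only for moderate values of $\lambda t$; larger values would need more terms.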


6.18. Consider the random process
$$Z(t) = AY(t)$$
where Y(t) is the semirandom telegraph signal of Prob. 6.17 and A is a r.v. independent of Y(t) that takes on the values $\pm 1$ with equal probability. The process Z(t) is known as the random telegraph signal.

(a) Show that Z(t) is WSS.
(b) Find the power spectral density of Z(t).

(a) Since E(A) = 0 and $E(A^2) = 1$, the mean of Z(t) is
$$\mu_Z(t) = E[Z(t)] = E(A)E[Y(t)] = 0 \tag{6.131}$$
and the autocorrelation of Z(t) is
$$R_Z(t, t+\tau) = E[A^2 Y(t)Y(t+\tau)] = E(A^2)E[Y(t)Y(t+\tau)] = R_Y(t, t+\tau)$$
Thus, using Eq. (6.130), we obtain
$$R_Z(t, t+\tau) = R_Z(\tau) = e^{-2\lambda|\tau|} \tag{6.132}$$
Thus, we see that Z(t) is WSS.

(b) Taking the Fourier transform of Eq. (6.132) (see Appendix B), we see that the power spectrum of Z(t) is given by
$$S_Z(\omega) = \frac{4\lambda}{\omega^2 + 4\lambda^2} \tag{6.133}$$
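
The transform pair used in part (b) can be checked numerically. The sketch below integrates $e^{-2\lambda|\tau|} \cos \omega\tau$ with a midpoint rule over a truncated range and compares it with the closed form; the function names, the truncation point, and the step count are assumptions for illustration.

```python
import math


def psd_closed_form(w: float, lam: float) -> float:
    """S_Z(w) = 4*lam / (w^2 + 4*lam^2), Eq. (6.133)."""
    return 4.0 * lam / (w * w + 4.0 * lam * lam)


def psd_numeric(w: float, lam: float, n: int = 100_000) -> float:
    """Midpoint-rule estimate of the Fourier integral of exp(-2*lam*|tau|)."""
    tau_max = 20.0 / lam  # exp(-2*lam*tau_max) = exp(-40) is negligible
    h = tau_max / n
    total = 0.0
    for k in range(n):
        tau = (k + 0.5) * h
        total += math.exp(-2.0 * lam * tau) * math.cos(w * tau)
    # R_Z is even, so the two-sided integral is twice the one-sided one.
    return 2.0 * total * h


if __name__ == "__main__":
    print(psd_numeric(3.0, 1.0), psd_closed_form(3.0, 1.0))
```

Only the cosine part is integrated because, as shown in Prob. 6.16, the sine part vanishes for an even autocorrelation function.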


6.19. Let X(t) and Y(t) be both zero-mean and WSS random processes. Consider the random process Z(t) defined by
$$Z(t) = X(t) + Y(t)$$
(a) Determine the autocorrelation function and the power spectral density of Z(t), (i) if X(t) and Y(t) are jointly WSS; (ii) if X(t) and Y(t) are orthogonal.
(b) Show that if X(t) and Y(t) are orthogonal, then the mean square of Z(t) is equal to the sum of the mean squares of X(t) and Y(t).

(a) The autocorrelation of Z(t) is given by
$$\begin{aligned} R_Z(t, s) &= E[Z(t)Z(s)] = E\{[X(t) + Y(t)][X(s) + Y(s)]\} \\ &= E[X(t)X(s)] + E[X(t)Y(s)] + E[Y(t)X(s)] + E[Y(t)Y(s)] \\ &= R_X(t, s) + R_{XY}(t, s) + R_{YX}(t, s) + R_Y(t, s) \end{aligned}$$
(i) If X(t) and Y(t) are jointly WSS, then we have
$$R_Z(\tau) = R_X(\tau) + R_{XY}(\tau) + R_{YX}(\tau) + R_Y(\tau)$$
where $\tau = s - t$. Taking the Fourier transform of the above expression, we obtain
$$S_Z(\omega) = S_X(\omega) + S_{XY}(\omega) + S_{YX}(\omega) + S_Y(\omega)$$
(ii) If X(t) and Y(t) are orthogonal [Eq. (6.21)],
$$R_{XY}(\tau) = R_{YX}(\tau) = 0$$
Then
$$R_Z(\tau) = R_X(\tau) + R_Y(\tau) \tag{6.134a}$$
$$S_Z(\omega) = S_X(\omega) + S_Y(\omega) \tag{6.134b}$$
(b) Setting $\tau = 0$ in Eq. (6.134a), and using Eq. (6.15), we get
$$E[Z^2(t)] = E[X^2(t)] + E[Y^2(t)]$$
which indicates that the mean square of Z(t) is equal to the sum of the mean squares of X(t) and Y(t).


WHITE NOISE 


6.20. Using the notion of generalized derivative, show that the generalized derivative X'(t) of the Wiener process X(t) is a white noise.

From Eq. (5.64),
$$R_X(t, s) = \sigma^2 \min(t, s)$$
and from Eq. (6.119) (Prob. 6.9), we have
$$\frac{\partial}{\partial s} R_X(t, s) = \sigma^2 u(t - s) \tag{6.135}$$
Now, using the $\delta$ function, the generalized derivative of a unit step function u(t) is given by
$$\frac{d}{dt} u(t) = \delta(t)$$
Applying the above relation to Eq. (6.135), we obtain
$$\frac{\partial^2}{\partial t\,\partial s} R_X(t, s) = \sigma^2 \frac{\partial}{\partial t} u(t - s) = \sigma^2 \delta(t - s) \tag{6.136}$$
which is, by Eq. (6.116) (Prob. 6.7), the autocorrelation function of the generalized derivative X'(t) of the Wiener process X(t); that is,
$$R_{X'}(t, s) = \sigma^2 \delta(t - s) = \sigma^2 \delta(\tau) \tag{6.137}$$
where $\tau = t - s$. Thus, by definition (6.43), we see that the generalized derivative X'(t) of the Wiener process X(t) is a white noise.

Recall that the Wiener process is a normal process and its derivative is also normal (see Prob. 6.10). Hence, the generalized derivative X'(t) of the Wiener process is called white normal (or white gaussian) noise.


6.21. Let X(t) be a Poisson process with rate $\lambda$. Let
$$Y(t) = X(t) - \lambda t$$
Show that the generalized derivative Y'(t) of Y(t) is a white noise.

Since $Y(t) = X(t) - \lambda t$, we have formally
$$Y'(t) = X'(t) - \lambda \tag{6.138}$$
Then
$$E[Y'(t)] = E[X'(t) - \lambda] = E[X'(t)] - \lambda \tag{6.139}$$
$$\begin{aligned} R_{Y'}(t, s) &= E[Y'(t)Y'(s)] = E\{[X'(t) - \lambda][X'(s) - \lambda]\} \\ &= E[X'(t)X'(s) - \lambda X'(s) - \lambda X'(t) + \lambda^2] \\ &= E[X'(t)X'(s)] - \lambda E[X'(s)] - \lambda E[X'(t)] + \lambda^2 \end{aligned} \tag{6.140}$$
Now, from Eqs. (5.56) and (5.60), we have
$$E[X(t)] = \lambda t$$
$$R_X(t, s) = \lambda \min(t, s) + \lambda^2 ts$$
Thus
$$E[X'(t)] = \lambda \quad \text{and} \quad E[X'(s)] = \lambda \tag{6.141}$$
and from Eqs. (6.7) and (6.137),
$$E[X'(t)X'(s)] = R_{X'}(t, s) = \frac{\partial^2 R_X(t, s)}{\partial t\,\partial s} = \lambda \delta(t - s) + \lambda^2 \tag{6.142}$$
Substituting Eq. (6.141) into Eq. (6.139), we obtain
$$E[Y'(t)] = 0 \tag{6.143}$$
Substituting Eqs. (6.141) and (6.142) into Eq. (6.140), we get
$$R_{Y'}(t, s) = \lambda \delta(t - s) \tag{6.144}$$
Hence we see that Y'(t) is a zero-mean WSS random process, and by definition (6.43), Y'(t) is a white noise with $\sigma^2 = \lambda$. The process Y'(t) is known as the Poisson white noise.


6.22. Let X(t) be a white normal noise. Let
$$Y(t) = \int_0^t X(\alpha)\, d\alpha$$
(a) Find the autocorrelation function of Y(t).
(b) Show that Y(t) is the Wiener process.

(a) From Eq. (6.137) of Prob. 6.20,
$$R_X(t, s) = \sigma^2 \delta(t - s)$$
Thus, by Eq. (6.11), the autocorrelation function of Y(t) is
$$\begin{aligned} R_Y(t, s) &= \int_0^t \int_0^s R_X(\alpha, \beta)\, d\beta\, d\alpha \\ &= \int_0^s \int_0^t \sigma^2 \delta(\alpha - \beta)\, d\alpha\, d\beta \\ &= \sigma^2 \int_0^s u(t - \beta)\, d\beta \\ &= \sigma^2 \int_0^{\min(t, s)} d\beta = \sigma^2 \min(t, s) \end{aligned} \tag{6.145}$$
(b) Comparing Eq. (6.145) and Eq. (5.64), we see that Y(t) has the same autocorrelation function as the Wiener process. In addition, Y(t) is normal, since X(t) is a normal process and Y(0) = 0. Thus, we conclude that Y(t) is the Wiener process.


6.23. Let Y(n) = X(n) + W(n), where X(n) = A (for all n) and A is a r.v. with zero mean and variance $\sigma_A^2$, and W(n) is a discrete-time white noise with average power $\sigma^2$. It is also assumed that X(n) and W(n) are independent.

(a) Show that Y(n) is WSS.
(b) Find the power spectral density $S_Y(\Omega)$ of Y(n).

(a) The mean of Y(n) is
$$E[Y(n)] = E[X(n)] + E[W(n)] = E(A) + E[W(n)] = 0$$
The autocorrelation function of Y(n) is
$$\begin{aligned} R_Y(n, n+k) &= E\{[X(n) + W(n)][X(n+k) + W(n+k)]\} \\ &= E[X(n)X(n+k)] + E[X(n)]E[W(n+k)] + E[W(n)]E[X(n+k)] + E[W(n)W(n+k)] \\ &= E(A^2) + R_W(k) = \sigma_A^2 + \sigma^2 \delta(k) = R_Y(k) \end{aligned} \tag{6.146}$$
Thus Y(n) is WSS.

(b) Taking the Fourier transform of Eq. (6.146), we obtain
$$S_Y(\Omega) = 2\pi \sigma_A^2 \delta(\Omega) + \sigma^2 \qquad -\pi < \Omega < \pi \tag{6.147}$$


RESPONSE OF LINEAR SYSTEMS TO RANDOM INPUTS 


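6.24. Derive Eq. (6.58).

Using Eq. (6.56), we have
$$\begin{aligned} R_Y(t, s) &= E[Y(t)Y(s)] \\ &= E\left[ \int_{-\infty}^{\infty} h(\alpha)X(t - \alpha)\, d\alpha \int_{-\infty}^{\infty} h(\beta)X(s - \beta)\, d\beta \right] \\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\alpha)h(\beta) E[X(t - \alpha)X(s - \beta)]\, d\alpha\, d\beta \\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\alpha)h(\beta) R_X(t - \alpha, s - \beta)\, d\alpha\, d\beta \end{aligned}$$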


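6.25. Derive Eq. (6.63).

From Eq. (6.62), we have
$$R_Y(\tau) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\alpha)h(\beta) R_X(\tau + \alpha - \beta)\, d\alpha\, d\beta$$
Taking the Fourier transform of $R_Y(\tau)$, we obtain
$$S_Y(\omega) = \int_{-\infty}^{\infty} R_Y(\tau) e^{-j\omega\tau}\, d\tau = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\alpha)h(\beta) R_X(\tau + \alpha - \beta) e^{-j\omega\tau}\, d\alpha\, d\beta\, d\tau$$
Letting $\tau + \alpha - \beta = \lambda$, we get
$$\begin{aligned} S_Y(\omega) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\alpha)h(\beta) R_X(\lambda) e^{-j\omega(\lambda - \alpha + \beta)}\, d\alpha\, d\beta\, d\lambda \\ &= \int_{-\infty}^{\infty} h(\alpha) e^{j\omega\alpha}\, d\alpha \int_{-\infty}^{\infty} h(\beta) e^{-j\omega\beta}\, d\beta \int_{-\infty}^{\infty} R_X(\lambda) e^{-j\omega\lambda}\, d\lambda \\ &= H(-\omega)H(\omega)S_X(\omega) \\ &= H^*(\omega)H(\omega)S_X(\omega) = |H(\omega)|^2 S_X(\omega) \end{aligned}$$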


6.26. A WSS random process X(t) with autocorrelation function
$$R_X(\tau) = e^{-a|\tau|}$$
where a is a real positive constant, is applied to the input of an LTI system with impulse response
$$h(t) = e^{-bt} u(t)$$
where b is a real positive constant. Find the autocorrelation function of the output Y(t) of the system.

The frequency response $H(\omega)$ of the system is
$$H(\omega) = \mathcal{F}[h(t)] = \frac{1}{j\omega + b}$$
The power spectral density of X(t) is
$$S_X(\omega) = \mathcal{F}[R_X(\tau)] = \frac{2a}{\omega^2 + a^2}$$
By Eq. (6.63), the power spectral density of Y(t) is
$$\begin{aligned} S_Y(\omega) = |H(\omega)|^2 S_X(\omega) &= \left( \frac{1}{\omega^2 + b^2} \right) \left( \frac{2a}{\omega^2 + a^2} \right) \\ &= \frac{a}{(a^2 - b^2)b} \left( \frac{2b}{\omega^2 + b^2} \right) - \frac{b}{(a^2 - b^2)b} \left( \frac{2a}{\omega^2 + a^2} \right) \end{aligned}$$
Taking the inverse Fourier transform of both sides of the above equation, we obtain
$$R_Y(\tau) = \frac{1}{(a^2 - b^2)b} \left( a e^{-b|\tau|} - b e^{-a|\tau|} \right)$$
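
The partial-fraction step is easy to get wrong by a sign or a factor, so a quick numerical comparison is useful. The sketch below evaluates both forms of $S_Y(\omega)$ and the resulting $R_Y(\tau)$; the function names and sample values are assumptions for illustration, and the formulas require $a \ne b$.

```python
import math


def s_y_product(w: float, a: float, b: float) -> float:
    """|H(w)|^2 * S_X(w) written as a single product."""
    return (1.0 / (w * w + b * b)) * (2.0 * a / (w * w + a * a))


def s_y_partial_fractions(w: float, a: float, b: float) -> float:
    """The same spectrum after the partial-fraction expansion (a != b)."""
    c = 1.0 / ((a * a - b * b) * b)
    return c * (a * 2.0 * b / (w * w + b * b) - b * 2.0 * a / (w * w + a * a))


def r_y(tau: float, a: float, b: float) -> float:
    """Output autocorrelation obtained by inverting each term (a != b)."""
    c = 1.0 / ((a * a - b * b) * b)
    return c * (a * math.exp(-b * abs(tau)) - b * math.exp(-a * abs(tau)))


if __name__ == "__main__":
    for w in (0.0, 0.5, 2.0):
        print(w, s_y_product(w, 3.0, 1.0), s_y_partial_fractions(w, 3.0, 1.0))
    print(r_y(0.0, 3.0, 1.0))
```

At $\tau = 0$ the expression reduces to $R_Y(0) = 1/[b(a + b)]$, which is the average power of the output.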


6.27. Verify Eq. (6.25), that is, the power spectral density of any WSS process X(t) is real and $S_X(\omega) \ge 0$.

The realness of $S_X(\omega)$ was shown in Prob. 6.16. Consider an ideal bandpass filter with frequency response (Fig. 6-5)
$$H(\omega) = \begin{cases} 1 & \omega_1 < |\omega| < \omega_2 \\ 0 & \text{otherwise} \end{cases}$$
with a random process X(t) as its input.

From Eq. (6.63), it follows that the power spectral density $S_Y(\omega)$ of the output Y(t) equals
$$S_Y(\omega) = \begin{cases} S_X(\omega) & \omega_1 < |\omega| < \omega_2 \\ 0 & \text{otherwise} \end{cases}$$
Hence, from Eq. (6.27), we have
$$E[Y^2(t)] = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_Y(\omega)\, d\omega = 2\, \frac{1}{2\pi} \int_{\omega_1}^{\omega_2} S_X(\omega)\, d\omega \ge 0$$
which indicates that the area of $S_X(\omega)$ in any interval of $\omega$ is nonnegative. This is possible only if $S_X(\omega) \ge 0$ for every $\omega$.


6.28. Verify Eq. (6.72).

From Eq. (6.71), we have
$$R_Y(k) = \sum_{i=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} h(i)h(l) R_X(k + i - l)$$
Taking the Fourier transform of $R_Y(k)$, we obtain
$$S_Y(\Omega) = \sum_{k=-\infty}^{\infty} R_Y(k) e^{-j\Omega k} = \sum_{k=-\infty}^{\infty} \sum_{i=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} h(i)h(l) R_X(k + i - l) e^{-j\Omega k}$$
Letting $k + i - l = n$, we get
$$\begin{aligned} S_Y(\Omega) &= \sum_{n=-\infty}^{\infty} \sum_{i=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} h(i)h(l) R_X(n) e^{-j\Omega(n - i + l)} \\ &= \sum_{i=-\infty}^{\infty} h(i) e^{j\Omega i} \sum_{l=-\infty}^{\infty} h(l) e^{-j\Omega l} \sum_{n=-\infty}^{\infty} R_X(n) e^{-j\Omega n} \\ &= H(-\Omega)H(\Omega)S_X(\Omega) \\ &= H^*(\Omega)H(\Omega)S_X(\Omega) = |H(\Omega)|^2 S_X(\Omega) \end{aligned}$$

6.29. The discrete-time system shown in Fig. 6-6 consists of one unit delay element and one scalar multiplier (a < 1). The input X(n) is discrete-time white noise with average power $\sigma^2$. Find the spectral density and average power of the output Y(n).

From Fig. 6-6, Y(n) and X(n) are related by
$$Y(n) = aY(n - 1) + X(n) \tag{6.148}$$
The impulse response h(n) of the system is defined by
$$h(n) = ah(n - 1) + \delta(n) \tag{6.149}$$
Solving Eq. (6.149), we obtain
$$h(n) = a^n u(n) \tag{6.150}$$
where u(n) is the unit step sequence defined by
$$u(n) = \begin{cases} 1 & n \ge 0 \\ 0 & n < 0 \end{cases}$$
Taking the Fourier transform of Eq. (6.150), we obtain
$$H(\Omega) = \sum_{n=0}^{\infty} a^n e^{-j\Omega n} = \frac{1}{1 - a e^{-j\Omega}} \qquad a < 1,\ |\Omega| < \pi$$
Now, by Eq. (6.48),
$$S_X(\Omega) = \sigma^2 \qquad |\Omega| < \pi$$
and by Eq. (6.72), the power spectral density of Y(n) is
$$\begin{aligned} S_Y(\Omega) = |H(\Omega)|^2 S_X(\Omega) &= H(\Omega)H(-\Omega)S_X(\Omega) \\ &= \frac{\sigma^2}{(1 - a e^{-j\Omega})(1 - a e^{j\Omega})} \\ &= \frac{\sigma^2}{1 + a^2 - 2a \cos \Omega} \qquad |\Omega| < \pi \end{aligned} \tag{6.151}$$
Taking the inverse Fourier transform of Eq. (6.151), we obtain
$$R_Y(k) = \frac{\sigma^2}{1 - a^2}\, a^{|k|}$$
Thus, by Eq. (6.33), the average power of Y(n) is
$$E[Y^2(n)] = R_Y(0) = \frac{\sigma^2}{1 - a^2}$$
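
The result of Prob. 6.29 can be checked in the time domain. For a white-noise input, Eq. (6.71) reduces to $R_Y(k) = \sigma^2 \sum_n h(n)h(n + |k|)$, and h(n) can be generated from the recursion (6.149). The sketch below does exactly that; the function names and the truncation length are assumptions, and the truncation is adequate only when |a| is not too close to 1.

```python
def impulse_response(a: float, n_terms: int) -> list:
    """h(0..n_terms-1) from the recursion h(n) = a*h(n-1) + delta(n)."""
    h = []
    prev = 0.0
    for n in range(n_terms):
        cur = a * prev + (1.0 if n == 0 else 0.0)
        h.append(cur)
        prev = cur
    return h


def output_autocorr(k: int, a: float, sigma2: float, n_terms: int = 500) -> float:
    """R_Y(k) = sigma2 * sum_n h(n) h(n + |k|) for a white-noise input."""
    lag = abs(k)
    h = impulse_response(a, n_terms + lag)
    return sigma2 * sum(h[n] * h[n + lag] for n in range(n_terms))


if __name__ == "__main__":
    a, sigma2 = 0.5, 2.0
    for k in range(4):
        print(k, output_autocorr(k, a, sigma2), sigma2 * a ** k / (1.0 - a * a))
```

The two printed columns should agree, and the k = 0 entry is the average output power $\sigma^2/(1 - a^2)$.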


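6.30. Let Y(t) be the output of an LTI system with impulse response h(t), when X(t) is applied as input. Show that
$$\text{(a)} \quad R_{XY}(t, s) = \int_{-\infty}^{\infty} h(\beta) R_X(t, s - \beta)\, d\beta \tag{6.152}$$
$$\text{(b)} \quad R_Y(t, s) = \int_{-\infty}^{\infty} h(\alpha) R_{XY}(t - \alpha, s)\, d\alpha \tag{6.153}$$

(a) Using Eq. (6.56), we have
$$\begin{aligned} R_{XY}(t, s) = E[X(t)Y(s)] &= E\left[ X(t) \int_{-\infty}^{\infty} h(\beta)X(s - \beta)\, d\beta \right] \\ &= \int_{-\infty}^{\infty} h(\beta) E[X(t)X(s - \beta)]\, d\beta = \int_{-\infty}^{\infty} h(\beta) R_X(t, s - \beta)\, d\beta \end{aligned}$$
(b) Similarly,
$$\begin{aligned} R_Y(t, s) = E[Y(t)Y(s)] &= E\left[ \int_{-\infty}^{\infty} h(\alpha)X(t - \alpha)\, d\alpha\; Y(s) \right] \\ &= \int_{-\infty}^{\infty} h(\alpha) E[X(t - \alpha)Y(s)]\, d\alpha = \int_{-\infty}^{\infty} h(\alpha) R_{XY}(t - \alpha, s)\, d\alpha \end{aligned}$$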


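6.31. Let Y(t) be the output of an LTI system with impulse response h(t) when a WSS random process X(t) is applied as input. Show that
$$\text{(a)} \quad S_{XY}(\omega) = H(\omega)S_X(\omega) \tag{6.154}$$
$$\text{(b)} \quad S_Y(\omega) = H^*(\omega)S_{XY}(\omega) \tag{6.155}$$

(a) If X(t) is WSS, then Eq. (6.152) of Prob. 6.30 becomes
$$R_{XY}(t, s) = \int_{-\infty}^{\infty} h(\beta) R_X(s - t - \beta)\, d\beta \tag{6.156}$$
which indicates that $R_{XY}(t, s)$ is a function of the time difference $\tau = s - t$ only. Hence
$$R_{XY}(\tau) = \int_{-\infty}^{\infty} h(\beta) R_X(\tau - \beta)\, d\beta \tag{6.157}$$
Taking the Fourier transform of Eq. (6.157), we obtain
$$\begin{aligned} S_{XY}(\omega) &= \int_{-\infty}^{\infty} R_{XY}(\tau) e^{-j\omega\tau}\, d\tau = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\beta) R_X(\tau - \beta) e^{-j\omega\tau}\, d\beta\, d\tau \\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\beta) R_X(\lambda) e^{-j\omega(\lambda + \beta)}\, d\beta\, d\lambda \\ &= \int_{-\infty}^{\infty} h(\beta) e^{-j\omega\beta}\, d\beta \int_{-\infty}^{\infty} R_X(\lambda) e^{-j\omega\lambda}\, d\lambda = H(\omega)S_X(\omega) \end{aligned}$$
(b) Similarly, if X(t) is WSS, then by Eq. (6.156), Eq. (6.153) becomes
$$R_Y(t, s) = \int_{-\infty}^{\infty} h(\alpha) R_{XY}(s - t + \alpha)\, d\alpha$$
which indicates that $R_Y(t, s)$ is a function of the time difference $\tau = s - t$ only. Hence
$$R_Y(\tau) = \int_{-\infty}^{\infty} h(\alpha) R_{XY}(\tau + \alpha)\, d\alpha \tag{6.158}$$
Taking the Fourier transform of $R_Y(\tau)$, we obtain
$$\begin{aligned} S_Y(\omega) &= \int_{-\infty}^{\infty} R_Y(\tau) e^{-j\omega\tau}\, d\tau = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\alpha) R_{XY}(\tau + \alpha) e^{-j\omega\tau}\, d\alpha\, d\tau \\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(\alpha) R_{XY}(\lambda) e^{-j\omega(\lambda - \alpha)}\, d\alpha\, d\lambda \\ &= \int_{-\infty}^{\infty} h(\alpha) e^{j\omega\alpha}\, d\alpha \int_{-\infty}^{\infty} R_{XY}(\lambda) e^{-j\omega\lambda}\, d\lambda \\ &= H(-\omega)S_{XY}(\omega) = H^*(\omega)S_{XY}(\omega) \end{aligned}$$
Note that from Eqs. (6.154) and (6.155), we obtain Eq. (6.63); that is,
$$S_Y(\omega) = H^*(\omega)S_{XY}(\omega) = H^*(\omega)H(\omega)S_X(\omega) = |H(\omega)|^2 S_X(\omega)$$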


6.32. Consider a WSS process X(t) with autocorrelation function $R_X(\tau)$ and power spectral density $S_X(\omega)$. Let X'(t) = dX(t)/dt. Show that
$$\text{(a)} \quad R_{XX'}(\tau) = \frac{d}{d\tau} R_X(\tau) \tag{6.159}$$
$$\text{(b)} \quad R_{X'}(\tau) = -\frac{d^2}{d\tau^2} R_X(\tau) \tag{6.160}$$
$$\text{(c)} \quad S_{X'}(\omega) = \omega^2 S_X(\omega) \tag{6.161}$$

(a) If X(t) is the input to a differentiator, then its output is Y(t) = X'(t). The frequency response of a differentiator is known as $H(\omega) = j\omega$. Then from Eq. (6.154),
$$S_{XX'}(\omega) = H(\omega)S_X(\omega) = j\omega S_X(\omega)$$
Taking the inverse Fourier transform of both sides, we obtain
$$R_{XX'}(\tau) = \frac{d}{d\tau} R_X(\tau)$$
(b) From Eq. (6.155),
$$S_{X'}(\omega) = H^*(\omega)S_{XX'}(\omega) = -j\omega S_{XX'}(\omega)$$
Again taking the inverse Fourier transform of both sides and using the result of part (a), we have
$$R_{X'}(\tau) = -\frac{d}{d\tau} R_{XX'}(\tau) = -\frac{d^2}{d\tau^2} R_X(\tau)$$
(c) From Eq. (6.63),
$$S_{X'}(\omega) = |H(\omega)|^2 S_X(\omega) = |j\omega|^2 S_X(\omega) = \omega^2 S_X(\omega)$$
Note that Eqs. (6.159) and (6.160) were proved in Prob. 6.8 by a different method.


FOURIER SERIES AND KARHUNEN-LOEVE EXPANSIONS 
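6.33. Verify Eqs. (6.80) and (6.81).

From Eq. (6.78),
$$X_n = \frac{1}{T} \int_0^T X(t) e^{-jn\omega_0 t}\, dt \qquad \omega_0 = 2\pi/T$$
Since X(t) is WSS, $E[X(t)] = \mu_X$, and we have
$$E(X_n) = \frac{1}{T} \int_0^T E[X(t)] e^{-jn\omega_0 t}\, dt = \mu_X \frac{1}{T} \int_0^T e^{-jn\omega_0 t}\, dt = \mu_X \delta(n)$$
Again using Eq. (6.78), we have
$$E(X_n X_m^*) = E\left[ X_n \frac{1}{T} \int_0^T X^*(s) e^{jm\omega_0 s}\, ds \right] = \frac{1}{T} \int_0^T E[X_n X^*(s)] e^{jm\omega_0 s}\, ds$$
Now
$$\begin{aligned} E[X_n X^*(s)] &= E\left[ \frac{1}{T} \int_0^T X(t) e^{-jn\omega_0 t}\, dt\; X^*(s) \right] \\ &= \frac{1}{T} \int_0^T E[X(t)X^*(s)] e^{-jn\omega_0 t}\, dt \\ &= \frac{1}{T} \int_0^T R_X(t - s) e^{-jn\omega_0 t}\, dt \end{aligned}$$
Letting $t - s = \tau$, and using Eq. (6.76), we obtain
$$\begin{aligned} E[X_n X^*(s)] &= \frac{1}{T} \int_0^T R_X(\tau) e^{-jn\omega_0(\tau + s)}\, d\tau \\ &= \left\{ \frac{1}{T} \int_0^T R_X(\tau) e^{-jn\omega_0 \tau}\, d\tau \right\} e^{-jn\omega_0 s} = c_n e^{-jn\omega_0 s} \end{aligned} \tag{6.162}$$
Thus
$$E(X_n X_m^*) = \frac{1}{T} \int_0^T c_n e^{-jn\omega_0 s} e^{jm\omega_0 s}\, ds = c_n \frac{1}{T} \int_0^T e^{-j(n - m)\omega_0 s}\, ds = c_n \delta(n - m)$$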


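6.34. Let $\hat{X}(t)$ be the Fourier series representation of X(t) shown in Eq. (6.77). Verify Eq. (6.79).

From Eq. (6.77), we have
$$\begin{aligned} E\{|X(t) - \hat{X}(t)|^2\} &= E\left\{ \left| X(t) - \sum_{n=-\infty}^{\infty} X_n e^{jn\omega_0 t} \right|^2 \right\} \\ &= E\left\{ \left[ X(t) - \sum_{n=-\infty}^{\infty} X_n e^{jn\omega_0 t} \right] \left[ X^*(t) - \sum_{n=-\infty}^{\infty} X_n^* e^{-jn\omega_0 t} \right] \right\} \\ &= E[|X(t)|^2] - \sum_{n=-\infty}^{\infty} E[X_n^* X(t)] e^{-jn\omega_0 t} - \sum_{n=-\infty}^{\infty} E[X_n X^*(t)] e^{jn\omega_0 t} + \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} E[X_n X_m^*] e^{j(n - m)\omega_0 t} \end{aligned}$$
Now, by Eqs. (6.81) and (6.162), we have
$$E[X_n^* X(t)] = c_n^* e^{jn\omega_0 t}$$
$$E[X_n X^*(t)] = c_n e^{-jn\omega_0 t}$$
$$E(X_n X_m^*) = c_n \delta(n - m)$$
Using these results, finally we obtain
$$E\{|X(t) - \hat{X}(t)|^2\} = R_X(0) - \sum_{n=-\infty}^{\infty} c_n^* - \sum_{n=-\infty}^{\infty} c_n + \sum_{n=-\infty}^{\infty} c_n = 0$$
since each sum above equals $R_X(0)$ [see Eq. (6.75)].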


6.35. Let X(t) be m.s. periodic and represented by the Fourier series [Eq. (6.77)]
$$X(t) = \sum_{n=-\infty}^{\infty} X_n e^{jn\omega_0 t} \qquad \omega_0 = 2\pi/T$$
Show that
$$E[|X(t)|^2] = \sum_{n=-\infty}^{\infty} E(|X_n|^2) \tag{6.163}$$

From Eq. (6.81), we have
$$E(|X_n|^2) = E(X_n X_n^*) = c_n \tag{6.164}$$
Setting $\tau = 0$ in Eq. (6.75), we obtain
$$E[|X(t)|^2] = R_X(0) = \sum_{n=-\infty}^{\infty} c_n = \sum_{n=-\infty}^{\infty} E(|X_n|^2)$$
Equation (6.163) is known as Parseval's theorem for the Fourier series.


6.36. If a random process X(t) is represented by a Karhunen-Loéve expansion [Eq. (6.82)]
$$X(t) = \sum_{n=1}^{\infty} X_n \phi_n(t) \qquad 0 < t < T$$
and $X_n$'s are orthogonal, show that $\phi_n(t)$ must satisfy integral equation (6.86); that is,
$$\int_0^T R_X(t, s)\phi_n(s)\, ds = \lambda_n \phi_n(t) \qquad 0 < t < T$$

Consider
$$X(t)X_n^* = \sum_{m=1}^{\infty} X_m X_n^* \phi_m(t)$$
Then
$$E[X(t)X_n^*] = \sum_{m=1}^{\infty} E(X_m X_n^*)\phi_m(t) = E(|X_n|^2)\phi_n(t) \tag{6.165}$$
since $X_n$'s are orthogonal; that is, $E(X_m X_n^*) = 0$ if $m \ne n$. But by Eq. (6.84),
$$\begin{aligned} E[X(t)X_n^*] &= E\left[ X(t) \int_0^T X^*(s)\phi_n(s)\, ds \right] \\ &= \int_0^T E[X(t)X^*(s)]\phi_n(s)\, ds \\ &= \int_0^T R_X(t, s)\phi_n(s)\, ds \end{aligned} \tag{6.166}$$
Thus, equating Eqs. (6.165) and (6.166), we obtain
$$\int_0^T R_X(t, s)\phi_n(s)\, ds = E(|X_n|^2)\phi_n(t) = \lambda_n \phi_n(t)$$
where $\lambda_n = E(|X_n|^2)$.


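6.37. Let $\hat{X}(t)$ be the Karhunen-Loéve expansion of X(t) shown in Eq. (6.82). Verify Eq. (6.88).

From Eqs. (6.166) and (6.86), we have
$$E[X(t)X_n^*] = \int_0^T R_X(t, s)\phi_n(s)\, ds = \lambda_n \phi_n(t) \tag{6.167}$$
Now by Eqs. (6.83), (6.84), and (6.167) we obtain
$$\begin{aligned} E(X_m X_n^*) &= E\left[ \int_0^T X(t)\phi_m^*(t)\, dt\; X_n^* \right] = \int_0^T E[X(t)X_n^*]\phi_m^*(t)\, dt \\ &= \int_0^T \lambda_n \phi_n(t)\phi_m^*(t)\, dt = \lambda_n \int_0^T \phi_n(t)\phi_m^*(t)\, dt \\ &= \lambda_n \delta(m - n) = \lambda_n \delta(n - m) \end{aligned} \tag{6.168}$$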


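6.38. Let $\hat{X}(t)$ be the Karhunen-Loéve expansion of X(t) shown in Eq. (6.82). Verify Eq. (6.85).

From Eq. (6.82), we have
$$\begin{aligned} E[|X(t) - \hat{X}(t)|^2] &= E\left\{ \left| X(t) - \sum_{n=1}^{\infty} X_n \phi_n(t) \right|^2 \right\} \\ &= E\left\{ \left[ X(t) - \sum_{n=1}^{\infty} X_n \phi_n(t) \right] \left[ X^*(t) - \sum_{n=1}^{\infty} X_n^* \phi_n^*(t) \right] \right\} \\ &= E[|X(t)|^2] - \sum_{n=1}^{\infty} E[X(t)X_n^*]\phi_n^*(t) - \sum_{n=1}^{\infty} E[X^*(t)X_n]\phi_n(t) + \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} E(X_n X_m^*)\phi_n(t)\phi_m^*(t) \end{aligned}$$
Using Eqs. (6.167) and (6.168), we have
$$E[|X(t) - \hat{X}(t)|^2] = R_X(t, t) - \sum_{n=1}^{\infty} \lambda_n \phi_n(t)\phi_n^*(t) - \sum_{n=1}^{\infty} \lambda_n^* \phi_n^*(t)\phi_n(t) + \sum_{n=1}^{\infty} \lambda_n \phi_n(t)\phi_n^*(t) = 0$$
since by Mercer's theorem [Eq. (6.87)]
$$R_X(t, t) = \sum_{n=1}^{\infty} \lambda_n \phi_n(t)\phi_n^*(t)$$
and $\lambda_n = E(|X_n|^2) = \lambda_n^*$.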


6.39. Find the Karhunen-Loéve expansion of the Wiener process X(t).

From Eq. (5.64),
$$R_X(t, s) = \sigma^2 \min(t, s) = \begin{cases} \sigma^2 s & s < t \\ \sigma^2 t & s > t \end{cases}$$
Substituting the above expression into Eq. (6.86), we obtain
$$\sigma^2 \int_0^T \min(t, s)\phi_n(s)\, ds = \lambda_n \phi_n(t) \qquad 0 < t < T \tag{6.169}$$
or
$$\sigma^2 \int_0^t s\phi_n(s)\, ds + \sigma^2 t \int_t^T \phi_n(s)\, ds = \lambda_n \phi_n(t) \tag{6.170}$$
Differentiating Eq. (6.170) with respect to t, we get
$$\sigma^2 \int_t^T \phi_n(s)\, ds = \lambda_n \phi_n'(t) \tag{6.171}$$
Differentiating Eq. (6.171) with respect to t again, we obtain
$$\phi_n''(t) + \frac{\sigma^2}{\lambda_n} \phi_n(t) = 0 \tag{6.172}$$
A general solution of Eq. (6.172) is
$$\phi_n(t) = a_n \sin \omega_n t + b_n \cos \omega_n t \qquad \omega_n = \sigma/\sqrt{\lambda_n}$$
In order to determine the values of $a_n$, $b_n$, and $\lambda_n$ (or $\omega_n$), we need appropriate boundary conditions. From Eq. (6.170), we see that $\phi_n(0) = 0$. This implies that $b_n = 0$. From Eq. (6.171), we see that $\phi_n'(T) = 0$. This implies that
$$\omega_n = \frac{\sigma}{\sqrt{\lambda_n}} = \frac{(2n - 1)\pi}{2T} = \frac{(n - \frac{1}{2})\pi}{T} \qquad n = 1, 2, \ldots$$
Therefore the eigenvalues are given by
$$\lambda_n = \frac{\sigma^2 T^2}{(n - \frac{1}{2})^2 \pi^2} \qquad n = 1, 2, \ldots \tag{6.173}$$
The normalization requirement [Eq. (6.83)] implies that
$$\int_0^T (a_n \sin \omega_n t)^2\, dt = \frac{a_n^2 T}{2} = 1 \quad \Rightarrow \quad a_n = \sqrt{\frac{2}{T}}$$
Thus, the eigenfunctions are given by
$$\phi_n(t) = \sqrt{\frac{2}{T}} \sin\left( n - \frac{1}{2} \right) \frac{\pi}{T} t \qquad 0 < t < T \tag{6.174}$$
and the Karhunen-Loéve expansion of the Wiener process X(t) is
$$X(t) = \sqrt{\frac{2}{T}} \sum_{n=1}^{\infty} X_n \sin\left( n - \frac{1}{2} \right) \frac{\pi}{T} t \qquad 0 < t < T \tag{6.175}$$
where $X_n$ are given by
$$X_n = \sqrt{\frac{2}{T}} \int_0^T X(t) \sin\left( n - \frac{1}{2} \right) \frac{\pi}{T} t\, dt$$
and they are uncorrelated with variance $\lambda_n$.

6.40. Find the Karhunen-Loéve expansion of the white normal (or white gaussian) noise W(t).

From Eq. (6.43),
$$R_W(t, s) = \sigma^2 \delta(t - s)$$
Substituting the above expression into Eq. (6.86), we obtain
$$\sigma^2 \int_0^T \delta(t - s)\phi_n(s)\, ds = \lambda_n \phi_n(t) \qquad 0 < t < T$$
or [by Eq. (6.44)]
$$\sigma^2 \phi_n(t) = \lambda_n \phi_n(t) \tag{6.176}$$
which indicates that all $\lambda_n = \sigma^2$ and $\phi_n(t)$ are arbitrary. Thus, any complete orthogonal set $\{\phi_n(t)\}$ with corresponding eigenvalues $\lambda_n = \sigma^2$ can be used in the Karhunen-Loéve expansion of the white gaussian noise.
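
For the Wiener-process expansion of Prob. 6.39, Mercer's theorem [Eq. (6.87)] predicts that the truncated sum of $\lambda_n \phi_n(t)\phi_n(s)$ should approach $\sigma^2 \min(t, s)$. The sketch below checks this numerically using the eigenvalues (6.173) and eigenfunctions (6.174); the function names, the number of terms, and the sample points are assumptions for illustration.

```python
import math


def kl_eigenvalue(n: int, sigma2: float, T: float) -> float:
    """lambda_n = sigma^2 T^2 / ((n - 1/2)^2 pi^2), Eq. (6.173)."""
    return sigma2 * T * T / (((n - 0.5) * math.pi) ** 2)


def kl_eigenfunction(n: int, t: float, T: float) -> float:
    """phi_n(t) = sqrt(2/T) sin((n - 1/2) pi t / T), Eq. (6.174)."""
    return math.sqrt(2.0 / T) * math.sin((n - 0.5) * math.pi * t / T)


def mercer_sum(t: float, s: float, sigma2: float = 1.0, T: float = 1.0,
               terms: int = 2000) -> float:
    """Truncated Mercer series sum_n lambda_n phi_n(t) phi_n(s)."""
    return sum(kl_eigenvalue(n, sigma2, T)
               * kl_eigenfunction(n, t, T)
               * kl_eigenfunction(n, s, T)
               for n in range(1, terms + 1))


if __name__ == "__main__":
    print(mercer_sum(0.3, 0.7), min(0.3, 0.7))
```

Because the eigenvalues decay like $1/n^2$, the truncation error falls off only like the reciprocal of the number of terms, so a few thousand terms give roughly three to four correct digits.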


FOURIER TRANSFORM OF RANDOM PROCESSES 
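6.41. Derive Eq. (6.94).

From Eq. (6.89),
$$\tilde{X}(\omega_1) = \int_{-\infty}^{\infty} X(t) e^{-j\omega_1 t}\, dt \qquad \tilde{X}^*(\omega_2) = \int_{-\infty}^{\infty} X^*(s) e^{j\omega_2 s}\, ds$$
Then
$$\begin{aligned} R_{\tilde{X}}(\omega_1, \omega_2) = E[\tilde{X}(\omega_1)\tilde{X}^*(\omega_2)] &= E\left[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} X(t)X^*(s) e^{-j(\omega_1 t - \omega_2 s)}\, dt\, ds \right] \\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} E[X(t)X^*(s)] e^{-j(\omega_1 t - \omega_2 s)}\, dt\, ds \\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} R_X(t, s) e^{-j[\omega_1 t + (-\omega_2)s]}\, dt\, ds = \tilde{R}_X(\omega_1, -\omega_2) \end{aligned}$$
in view of Eq. (6.93).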


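6.42. Derive Eqs. (6.98) and (6.99).

Since X(t) is WSS, by Eq. (6.93), and letting $t - s = \tau$, we have
$$\begin{aligned} \tilde{R}_X(\omega_1, \omega_2) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} R_X(t - s) e^{-j(\omega_1 t + \omega_2 s)}\, dt\, ds \\ &= \int_{-\infty}^{\infty} R_X(\tau) e^{-j\omega_1 \tau}\, d\tau \int_{-\infty}^{\infty} e^{-j(\omega_1 + \omega_2)s}\, ds \\ &= S_X(\omega_1) \int_{-\infty}^{\infty} e^{-j(\omega_1 + \omega_2)s}\, ds \end{aligned}$$
From the Fourier transform pair (Appendix B) $1 \leftrightarrow 2\pi\delta(\omega)$, we have
$$\int_{-\infty}^{\infty} e^{-j\omega t}\, dt = 2\pi\delta(\omega)$$
Hence
$$\tilde{R}_X(\omega_1, \omega_2) = 2\pi S_X(\omega_1)\delta(\omega_1 + \omega_2)$$
Next, from Eq. (6.94) and the above result, we obtain
$$R_{\tilde{X}}(\omega_1, \omega_2) = \tilde{R}_X(\omega_1, -\omega_2) = 2\pi S_X(\omega_1)\delta(\omega_1 - \omega_2)$$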


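6.43. Let $\tilde{X}(\omega)$ be the Fourier transform of a random process X(t). If $\tilde{X}(\omega)$ is a white noise with zero mean and autocorrelation function $q(\omega_1)\delta(\omega_1 - \omega_2)$, then show that X(t) is WSS with power spectral density $q(\omega)/2\pi$.

By Eq. (6.91),
$$X(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{X}(\omega) e^{j\omega t}\, d\omega$$
Then
$$E[X(t)] = \frac{1}{2\pi} \int_{-\infty}^{\infty} E[\tilde{X}(\omega)] e^{j\omega t}\, d\omega = 0 \tag{6.177}$$
Assuming that X(t) is a complex random process, we have
$$\begin{aligned} R_X(t, s) = E[X(t)X^*(s)] &= E\left[ \frac{1}{4\pi^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \tilde{X}(\omega_1)\tilde{X}^*(\omega_2) e^{j(\omega_1 t - \omega_2 s)}\, d\omega_1\, d\omega_2 \right] \\ &= \frac{1}{4\pi^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} E[\tilde{X}(\omega_1)\tilde{X}^*(\omega_2)] e^{j(\omega_1 t - \omega_2 s)}\, d\omega_1\, d\omega_2 \\ &= \frac{1}{4\pi^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} q(\omega_1)\delta(\omega_1 - \omega_2) e^{j(\omega_1 t - \omega_2 s)}\, d\omega_1\, d\omega_2 \\ &= \frac{1}{4\pi^2} \int_{-\infty}^{\infty} q(\omega_1) e^{j\omega_1(t - s)}\, d\omega_1 \end{aligned} \tag{6.178}$$
which depends only on $t - s = \tau$. Hence, we conclude that X(t) is WSS. Setting $t - s = \tau$ and $\omega_1 = \omega$ in Eq. (6.178), we have
$$R_X(\tau) = \frac{1}{4\pi^2} \int_{-\infty}^{\infty} q(\omega) e^{j\omega\tau}\, d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ \frac{1}{2\pi} q(\omega) \right] e^{j\omega\tau}\, d\omega$$
in view of Eq. (6.24). Thus, we obtain $S_X(\omega) = q(\omega)/2\pi$.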
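6.44. Verify Eq. (6.104).

By Eq. (6.100),
$$\tilde{X}(\Omega_1) = \sum_{n=-\infty}^{\infty} X(n) e^{-j\Omega_1 n} \qquad \tilde{X}^*(\Omega_2) = \sum_{m=-\infty}^{\infty} X^*(m) e^{j\Omega_2 m}$$
Then
$$\begin{aligned} R_{\tilde{X}}(\Omega_1, \Omega_2) = E[\tilde{X}(\Omega_1)\tilde{X}^*(\Omega_2)] &= \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} E[X(n)X^*(m)] e^{-j(\Omega_1 n - \Omega_2 m)} \\ &= \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} R_X(n, m) e^{-j[\Omega_1 n + (-\Omega_2)m]} = \tilde{R}_X(\Omega_1, -\Omega_2) \end{aligned}$$
in view of Eq. (6.103).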


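6.45. Derive Eqs. (6.105) and (6.106).

If X(n) is WSS, then $R_X(n, m) = R_X(n - m)$. By Eq. (6.103), and letting $n - m = k$, we have
$$\begin{aligned} \tilde{R}_X(\Omega_1, \Omega_2) &= \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} R_X(n - m) e^{-j(\Omega_1 n + \Omega_2 m)} \\ &= \sum_{k=-\infty}^{\infty} R_X(k) e^{-j\Omega_1 k} \sum_{m=-\infty}^{\infty} e^{-j(\Omega_1 + \Omega_2)m} \\ &= S_X(\Omega_1) \sum_{m=-\infty}^{\infty} e^{-j(\Omega_1 + \Omega_2)m} \end{aligned}$$
From the Fourier transform pair (Appendix B) $x(n) = 1 \leftrightarrow 2\pi\delta(\Omega)$, we have
$$\sum_{m=-\infty}^{\infty} e^{-jm(\Omega_1 + \Omega_2)} = 2\pi\delta(\Omega_1 + \Omega_2)$$
Hence
$$\tilde{R}_X(\Omega_1, \Omega_2) = 2\pi S_X(\Omega_1)\delta(\Omega_1 + \Omega_2)$$
Next, from Eq. (6.104) and the above result, we obtain
$$R_{\tilde{X}}(\Omega_1, \Omega_2) = \tilde{R}_X(\Omega_1, -\Omega_2) = 2\pi S_X(\Omega_1)\delta(\Omega_1 - \Omega_2)$$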

Supplementary Problems 


6.46. Is the Poisson process X(t) m.s. continuous?
Hint: Use Eq. (5.60) and proceed as in Prob. 6.4.
Ans. Yes.


6.47. Let X(t) be defined by (Prob. 5.4)
$$X(t) = Y \cos \omega t \qquad t \ge 0$$
where Y is a uniform r.v. over (0, 1) and $\omega$ is a constant.

(a) Is X(t) m.s. continuous?
(b) Does X(t) have a m.s. derivative?

Hint: Use Eq. (5.87) of Prob. 5.12.
Ans. (a) Yes; (b) yes.


6.48. Let Z(t) be the random telegraph signal of Prob. 6.18.

(a) Is Z(t) m.s. continuous?
(b) Does Z(t) have a m.s. derivative?

Hint: Use Eq. (6.132) of Prob. 6.18.
Ans. (a) Yes; (b) no.


6.49. Let X(t) be a WSS random process, and let X'(t) be its m.s. derivative. Show that E[X(t)X'(t)] = 0.

Hint: Use Eqs. (6.13) [or (6.14)] and (6.117).


6.50. Let
$$Z(t) = \frac{2}{T} \int_t^{t + T/2} X(\alpha)\, d\alpha$$
where X(t) is given by Prob. 6.47 with $\omega = 2\pi/T$.

(a) Find the mean of Z(t).
(b) Find the autocorrelation function of Z(t).

Ans. (a) $-\dfrac{1}{\pi} \sin \omega t$

(b) $R_Z(t, s) = \dfrac{4}{3\pi^2} \sin \omega t \sin \omega s$


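6.51. Consider a WSS random process X(t) with $E[X(t)] = \mu_X$. Let
$$\langle X(t) \rangle_T = \frac{1}{T} \int_{-T/2}^{T/2} X(t)\, dt$$
The process X(t) is said to be ergodic in the mean if
$$\operatorname*{l.i.m.}_{T \to \infty} \langle X(t) \rangle_T = E[X(t)] = \mu_X$$
Find $E[\langle X(t) \rangle_T]$.
Ans. $\mu_X$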


6.52. Let $X(t) = A \cos(\omega_0 t + \Theta)$, where A and $\omega_0$ are constants and $\Theta$ is a uniform r.v. over $(-\pi, \pi)$ (Prob. 5.20). Find the power spectral density of X(t).

Ans. $S_X(\omega) = \dfrac{A^2 \pi}{2} [\delta(\omega - \omega_0) + \delta(\omega + \omega_0)]$


6.53. A random process Y(t) is defined by
$$Y(t) = AX(t) \cos(\omega_c t + \Theta)$$
where A and $\omega_c$ are constants, $\Theta$ is a uniform r.v. over $(-\pi, \pi)$, and X(t) is a zero-mean WSS random process with the autocorrelation function $R_X(\tau)$ and the power spectral density $S_X(\omega)$. Furthermore, X(t) and $\Theta$ are independent. Show that Y(t) is WSS, and find the power spectral density of Y(t).

Ans. $S_Y(\omega) = \dfrac{A^2}{4} [S_X(\omega - \omega_c) + S_X(\omega + \omega_c)]$


Consider a discrete-time random process defined by 
X(n) = ¥ a; cos(Q,n + ©) 
i=l 
where a, and Q), are real constants and ©, are independent uniform r.v.’s over (—7, 7). 


(a) Find the mean of X(n). 
(b) Find the autocorrelation function of X(n). 


Ans, (a) E[X(n)] =0 


(b) Ry(n,n +k) = - 


a? cos(Q;k) 
=1 


Consider a discrete-time WSS random process X(n) with the autocorrelation function 
R,(k) = 10e7 °-5!*! 
Find the power spectral density of X(n). 


6.32 
Ans. Si = FRc SF 


Let X(t) and Y(t) be defined by 
X(t) = U cos wot + V sin wot 
Y(t) = V cos wt — U sin wot 
where jw, is constant and U and V are independent r.v.’s both having zero mean and variance o. 
(a) Find the cross-correlation function of X(t) and Y(t). 
(b) Find the cross power spectral density of X(t) and Y(¢). 
Ans. (a) $R_{XY}(t, t + \tau) = -\sigma^2 \sin\omega_0\tau$
(b) $S_{XY}(\omega) = j\sigma^2\pi[\delta(\omega - \omega_0) - \delta(\omega + \omega_0)]$

6.57. Verify Eqs. (6.36) and (6.37).
Hint: Substitute Eq. (6.18) into Eq. (6.34).

6.58. Let $Y(t) = X(t) + W(t)$, where $X(t)$ and $W(t)$ are orthogonal and $W(t)$ is a white noise specified by Eq. (6.43) or (6.45). Find the autocorrelation function of $Y(t)$.
Ans. $R_Y(t, s) = R_X(t, s) + \sigma^2\delta(t - s)$

6.59. A zero-mean WSS random process $X(t)$ is called band-limited white noise if its spectral density is given by
$$S_X(\omega) = \begin{cases} \dfrac{N_0}{2} & |\omega| \le \omega_B \\ 0 & |\omega| > \omega_B \end{cases}$$
Find the autocorrelation function of $X(t)$.
Ans. $R_X(\tau) = \dfrac{N_0\omega_B}{2\pi}\,\dfrac{\sin\omega_B\tau}{\omega_B\tau}$

6.60. A WSS random process $X(t)$ is applied to the input of an LTI system with impulse response $h(t) = 3e^{-2t}u(t)$. Find the mean value of $Y(t)$ of the system if $E[X(t)] = 2$.
Hint: Use Eq. (6.59).
Ans. 3

6.61. The input $X(t)$ to the RC filter shown in Fig. 6-7 is a white noise specified by Eq. (6.45). Find the mean-square value of $Y(t)$.
Hint: Use Eqs. (6.64) and (6.65).
Ans. $\sigma^2/(2RC)$

Fig. 6-7 RC filter.

6.62. The input $X(t)$ to a differentiator is the random telegraph signal of Prob. 6.18.
(a) Determine the power spectral density of the differentiator output.
(b) Find the mean-square value of the differentiator output.
Ans. (a) $S_Y(\omega) = \dfrac{4\lambda\omega^2}{\omega^2 + 4\lambda^2}$
(b) $E[Y^2(t)] = \infty$

6.63. Suppose that the input to the filter shown in Fig. 6-8 is a white noise specified by Eq. (6.45). Find the power spectral density of $Y(t)$.
Ans. $S_Y(\omega) = \sigma^2(1 + a^2 + 2a\cos\omega T)$
Fig. 6-8

6.64. Verify Eq. (6.67).
Hint: Proceed as in Prob. 6.24.

6.65. Suppose that the input to the discrete-time filter shown in Fig. 6-9 is a discrete-time white noise with average power $\sigma^2$. Find the power spectral density of $Y(n)$.
Ans. $S_Y(\Omega) = \sigma^2(1 + a^2 + 2a\cos\Omega)$

Fig. 6-9

6.66. Using the Karhunen-Loéve expansion of the Wiener process, obtain the Karhunen-Loéve expansion of the white normal noise.
Hint: Take the derivative of Eq. (6.175) of Prob. 6.39.
Ans. $\sqrt{\dfrac{2}{T}}\displaystyle\sum_{n=1}^{\infty} W_n \cos\left(n - \tfrac{1}{2}\right)\dfrac{\pi}{T}t \qquad 0 < t < T$
where $W_n$ are independent normal r.v.'s with the same variance $\sigma^2$.

6.67. Let $Y(t) = X(t) + W(t)$, where $X(t)$ and $W(t)$ are orthogonal and $W(t)$ is a white noise specified by Eq. (6.43) or (6.45). Let $\phi_n(t)$ be the eigenfunctions of the integral equation (6.86) and $\lambda_n$ the corresponding eigenvalues.
(a) Show that $\phi_n(t)$ are also the eigenfunctions of the integral equation for the Karhunen-Loéve expansion of $Y(t)$ with $R_Y(t, s)$.
(b) Find the corresponding eigenvalues.
Hint: Use the result of Prob. 6.58.
Ans. (b) $\lambda_n + \sigma^2$

6.68. Suppose that
$$X(t) = \sum_{n} X_n e^{jn\omega_0 t}$$
where $X_n$ are r.v.'s and $\omega_0$ is a constant. Find the Fourier transform of $X(t)$.
Ans. $\tilde{X}(\omega) = \displaystyle\sum_{n} 2\pi X_n\,\delta(\omega - n\omega_0)$

6.69. Let $\tilde{X}(\omega)$ be the Fourier transform of a continuous-time random process $X(t)$. Find the mean of $\tilde{X}(\omega)$.
Ans. $\mathscr{F}[\mu_X(t)] = \displaystyle\int_{-\infty}^{\infty} \mu_X(t)e^{-j\omega t}\,dt$ where $\mu_X(t) = E[X(t)]$

6.70. Let
$$\tilde{X}(\Omega) = \sum_{n=-\infty}^{\infty} X(n)e^{-j\Omega n}$$
where $E[X(n)] = 0$ and $E[X(n)X(k)] = \sigma_n^2\,\delta(n - k)$. Find the mean and the autocorrelation function of $\tilde{X}(\Omega)$.
Ans. $E[\tilde{X}(\Omega)] = 0 \qquad R_{\tilde{X}}(\Omega_1, \Omega_2) = \displaystyle\sum_{n=-\infty}^{\infty} \sigma_n^2 e^{-j(\Omega_1 - \Omega_2)n}$


Chapter 7 


Estimation Theory 


7.1 INTRODUCTION 


In this chapter, we present classical estimation theory. There are two basic types of estimation
problems. In the first type, we are interested in estimating the parameters of one or more r.v.'s, and in
the second type, we are interested in estimating the value of an inaccessible r.v. $Y$ in terms of the
observation of an accessible r.v. $X$.


7.2 PARAMETER ESTIMATION 


Let $X$ be a r.v. with pdf $f(x)$ and $X_1, \ldots, X_n$ a set of $n$ independent r.v.'s each with pdf $f(x)$. The
set of r.v.'s $(X_1, \ldots, X_n)$ is called a random sample (or sample vector) of size $n$ of $X$. Any real-valued
function of a random sample $s(X_1, \ldots, X_n)$ is called a statistic.

Let $X$ be a r.v. with pdf $f(x; \theta)$ which depends on an unknown parameter $\theta$. Let $(X_1, \ldots, X_n)$ be a
random sample of $X$. In this case, the joint pdf of $X_1, \ldots, X_n$ is given by
$$f(\mathbf{x}; \theta) = f(x_1, \ldots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta) \tag{7.1}$$
where $x_1, \ldots, x_n$ are the values of the observed data taken from the random sample.
An estimator of $\theta$ is any statistic $s(X_1, \ldots, X_n)$, denoted as
$$\Theta = s(X_1, \ldots, X_n) \tag{7.2}$$
For a particular set of observations $X_1 = x_1, \ldots, X_n = x_n$, the value of the estimator $s(x_1, \ldots, x_n)$ will
be called an estimate of $\theta$ and denoted by $\hat{\theta}$. Thus an estimator is a r.v. and an estimate is a particular
realization of it. It is not necessary that an estimate of a parameter be one single value; instead, the
estimate could be a range of values. Estimates which specify a single value are called point estimates,
and estimates which specify a range of values are called interval estimates.


7.3 PROPERTIES OF POINT ESTIMATORS 
A. Unbiased Estimators:

An estimator $\Theta = s(X_1, \ldots, X_n)$ is said to be an unbiased estimator of the parameter $\theta$ if
$$E(\Theta) = \theta \tag{7.3}$$
for all possible values of $\theta$. If $\Theta$ is an unbiased estimator, then its mean square error is given by
$$E[(\Theta - \theta)^2] = E\{[\Theta - E(\Theta)]^2\} = \text{Var}(\Theta) \tag{7.4}$$
That is, its mean square error equals its variance.


B. Efficient Estimators:

An estimator $\Theta_1$ is said to be a more efficient estimator of the parameter $\theta$ than the estimator $\Theta_2$
if
1. $\Theta_1$ and $\Theta_2$ are both unbiased estimators of $\theta$.
2. $\text{Var}(\Theta_1) < \text{Var}(\Theta_2)$.

The estimator $\Theta_{MV} = s(X_1, \ldots, X_n)$ is said to be a most efficient (or minimum variance) unbiased
estimator of the parameter $\theta$ if
1. It is an unbiased estimator of $\theta$.
2. $\text{Var}(\Theta_{MV}) \le \text{Var}(\Theta)$ for all $\Theta$ (that is, for every unbiased estimator $\Theta$ of $\theta$).


C. Consistent Estimators:

The estimator $\Theta_n$ of $\theta$ based on a random sample of size $n$ is said to be consistent if for any small
$\varepsilon > 0$,
$$\lim_{n \to \infty} P(|\Theta_n - \theta| < \varepsilon) = 1 \tag{7.5}$$
or equivalently,
$$\lim_{n \to \infty} P(|\Theta_n - \theta| \ge \varepsilon) = 0 \tag{7.6}$$
The following two conditions are sufficient for consistency (Prob. 7.5):
$$1.\quad \lim_{n \to \infty} E(\Theta_n) = \theta \tag{7.7}$$
$$2.\quad \lim_{n \to \infty} \text{Var}(\Theta_n) = 0 \tag{7.8}$$

7.4 MAXIMUM-LIKELIHOOD ESTIMATION 


Let $f(\mathbf{x}; \theta) = f(x_1, \ldots, x_n; \theta)$ denote the joint pmf of the r.v.'s $X_1, \ldots, X_n$ when they are discrete,
and let it be their joint pdf when they are continuous. Let
$$L(\theta) = f(\mathbf{x}; \theta) = f(x_1, \ldots, x_n; \theta) \tag{7.9}$$
Now $L(\theta)$ represents the likelihood that the values $x_1, \ldots, x_n$ will be observed when $\theta$ is the true value
of the parameter. Thus $L(\theta)$ is often referred to as the likelihood function of the random sample. Let
$\hat{\theta}_{ML} = s(x_1, \ldots, x_n)$ be the maximizing value of $L(\theta)$; that is,
$$L(\hat{\theta}_{ML}) = \max_{\theta} L(\theta) \tag{7.10}$$
Then the maximum-likelihood estimator of $\theta$ is
$$\Theta_{ML} = s(X_1, \ldots, X_n) \tag{7.11}$$
and $\hat{\theta}_{ML}$ is the maximum-likelihood estimate of $\theta$.

Since $L(\theta)$ is a product of either pmf's or pdf's, it will always be positive (for the range of possible
values of $\theta$). Thus $\ln L(\theta)$ can always be defined, and in determining the maximizing value of $\theta$, it is
often useful to use the fact that $L(\theta)$ and $\ln L(\theta)$ have their maximum at the same value of $\theta$. Hence,
we may also obtain $\hat{\theta}_{ML}$ by maximizing $\ln L(\theta)$.
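
When no closed form is at hand, the log-likelihood can be maximized numerically. The following is a minimal sketch, not a method from the text: it assumes an exponential sample, for which $\ln L(\lambda) = n \ln\lambda - \lambda\sum x_i$ is concave, and uses a golden-section search (a technique chosen here only for illustration; the function names and search interval are likewise assumptions). The data are the ten interarrival times quoted in the drive-in bank problem among this chapter's Supplementary Problems, so the numerical maximizer can be compared with the closed form of Eq. (7.37).

```python
import math

def exp_log_likelihood(lam, xs):
    """ln L(lambda) = n ln(lambda) - lambda * sum(x_i) for an exponential sample."""
    return len(xs) * math.log(lam) - lam * sum(xs)

def maximize(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)   # left interior point
        d = a + inv_phi * (b - a)   # right interior point
        if f(c) > f(d):
            b = d                   # maximum lies in [a, d]
        else:
            a = c                   # maximum lies in [c, b]
    return (a + b) / 2

# Interarrival times in minutes, taken from the drive-in bank supplementary problem.
times = [3.2, 2.1, 5.3, 4.2, 1.2, 2.8, 6.4, 1.5, 1.9, 3.0]

lam_numeric = maximize(lambda lam: exp_log_likelihood(lam, times), 1e-6, 10.0)
lam_closed = len(times) / sum(times)    # Eq. (7.37): n / sum(x_i)
print(lam_numeric, lam_closed)
```

Both values should agree to several decimal places and correspond to the tabulated answer 1/3.16. Maximizing $\ln L$ rather than $L$ also avoids numerical underflow when the sample is large.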


7.5 BAYES’ ESTIMATION 


Suppose that the unknown parameter $\theta$ is considered to be a r.v. having some fixed distribution
or prior pdf $f(\theta)$. Then $f(\mathbf{x}; \theta)$ is now viewed as a conditional pdf and written as $f(\mathbf{x} \mid \theta)$, and we can
express the joint pdf of the random sample $(X_1, \ldots, X_n)$ and $\theta$ as
$$f(x_1, \ldots, x_n, \theta) = f(x_1, \ldots, x_n \mid \theta) f(\theta) \tag{7.12}$$
and the marginal pdf of the sample is given by
$$f(x_1, \ldots, x_n) = \int_{R_\theta} f(x_1, \ldots, x_n, \theta)\,d\theta \tag{7.13}$$
where $R_\theta$ is the range of the possible values of $\theta$. The other conditional pdf,
$$f(\theta \mid x_1, \ldots, x_n) = \frac{f(x_1, \ldots, x_n, \theta)}{f(x_1, \ldots, x_n)} = \frac{f(x_1, \ldots, x_n \mid \theta) f(\theta)}{f(x_1, \ldots, x_n)} \tag{7.14}$$
is referred to as the posterior pdf of $\theta$. Thus the prior pdf $f(\theta)$ represents our information about $\theta$
prior to the observation of the outcomes of $X_1, \ldots, X_n$, and the posterior pdf $f(\theta \mid x_1, \ldots, x_n)$ represents
our information about $\theta$ after having observed the sample.
The conditional mean of $\theta$, defined by
$$\theta_B = E(\theta \mid x_1, \ldots, x_n) = \int_{R_\theta} \theta f(\theta \mid x_1, \ldots, x_n)\,d\theta \tag{7.15}$$
is called the Bayes' estimate of $\theta$, and
$$\Theta_B = E(\theta \mid X_1, \ldots, X_n) \tag{7.16}$$
is called the Bayes' estimator of $\theta$.
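
The chain (7.12) to (7.15) can be traced with exact arithmetic in a simple case. The sketch below is an illustration under stated assumptions, not the text's own procedure: it takes a Bernoulli sample with a uniform prior on (0, 1) (the setting of Prob. 7.11), evaluates the integrals by expanding $(1 - p)^k$ with the binomial theorem, and forms the posterior mean as a ratio of two integrals. The function names and the observed sample are hypothetical.

```python
from fractions import Fraction
from math import comb

def beta_integral(m, k):
    """Exact integral of p^m (1 - p)^k over (0, 1), by expanding (1 - p)^k."""
    return sum(Fraction((-1) ** j * comb(k, j), m + j + 1) for j in range(k + 1))

def bayes_estimate_bernoulli(xs):
    """Posterior mean of p for a Bernoulli sample under a uniform prior on (0, 1)."""
    n, m = len(xs), sum(xs)
    marginal = beta_integral(m, n - m)        # Eq. (7.13): integral of p^m (1-p)^(n-m)
    numerator = beta_integral(m + 1, n - m)   # integral of p * p^m (1-p)^(n-m)
    return numerator / marginal               # Eq. (7.15)

# Hypothetical observations: three successes in five trials.
xs = [1, 0, 1, 1, 0]
print(bayes_estimate_bernoulli(xs))   # 4/7
```

The result equals $(\sum x_i + 1)/(n + 2)$, the closed form obtained in Prob. 7.11. Unlike the maximum-likelihood estimate 3/5, it is pulled slightly toward the prior mean 1/2.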


7.6 MEAN SQUARE ESTIMATION 


In this section, we deal with the second type of estimation problem, that is, estimating the value
of an inaccessible r.v. $Y$ in terms of the observation of an accessible r.v. $X$. In general, the estimator $\hat{Y}$
of $Y$ is given by a function of $X$, $g(X)$. Then $Y - \hat{Y} = Y - g(X)$ is called the estimation error, and
there is a cost associated with this error, $C[Y - g(X)]$. We are interested in finding the function $g(X)$
that minimizes this cost. When $X$ and $Y$ are continuous r.v.'s, the mean square (m.s.) error is often
used as the cost function,
$$C[Y - g(X)] = E\{[Y - g(X)]^2\} \tag{7.17}$$
It can be shown that the estimator of $Y$ given by (Prob. 7.17),
$$\hat{Y} = g(X) = E(Y \mid X) \tag{7.18}$$
is the best estimator in the sense that the m.s. error defined by Eq. (7.17) is a minimum.


7.7 LINEAR MEAN SQUARE ESTIMATION 


Now consider the estimator $\hat{Y}$ of $Y$ given by
$$\hat{Y} = g(X) = aX + b \tag{7.19}$$
We would like to find the values of $a$ and $b$ such that the m.s. error defined by
$$e = E[(Y - \hat{Y})^2] = E\{[Y - (aX + b)]^2\} \tag{7.20}$$
is minimum. We maintain that $a$ and $b$ must be such that (Prob. 7.20)
$$E\{[Y - (aX + b)]X\} = 0 \tag{7.21}$$
and $a$ and $b$ are given by
$$a = \frac{\sigma_{XY}}{\sigma_X^2} = \frac{\sigma_Y}{\sigma_X}\rho_{XY} \qquad b = \mu_Y - a\mu_X \tag{7.22}$$
and the minimum m.s. error $e_m$ is (Prob. 7.22)
$$e_m = \sigma_Y^2(1 - \rho_{XY}^2) \tag{7.23}$$
where $\sigma_{XY} = \text{Cov}(X, Y)$ and $\rho_{XY}$ is the correlation coefficient of $X$ and $Y$. Note that Eq. (7.21) states
that the optimum linear m.s. estimator $\hat{Y} = aX + b$ of $Y$ is such that the estimation error $Y - \hat{Y} = Y - (aX + b)$
is orthogonal to the observation $X$. This is known as the orthogonality principle. The line
$y = ax + b$ is often called a regression line.
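
Equations (7.22) and (7.23) need only the first and second moments of $X$ and $Y$. The sketch below, offered as an illustration rather than as part of the original text, evaluates them with exact fractions for the case $Y = X^2$ with $X$ uniform over $(-1, 1)$, the setting of Prob. 7.23. The function names are assumptions; the moments $E(X^k) = 1/(k + 1)$ for even $k$ and 0 for odd $k$ follow from direct integration of the uniform pdf.

```python
from fractions import Fraction

def linear_ms_estimator(ex, ey, exx, eyy, exy):
    """Return (a, b, e_min) for Y_hat = a X + b from first and second moments."""
    var_x = exx - ex * ex
    var_y = eyy - ey * ey
    cov_xy = exy - ex * ey
    a = cov_xy / var_x                     # Eq. (7.22)
    b = ey - a * ex                        # Eq. (7.22)
    e_min = var_y - cov_xy ** 2 / var_x    # sigma_Y^2 (1 - rho_XY^2), Eq. (7.23)
    return a, b, e_min

def uniform_moment(k):
    """E[X^k] for X uniform over (-1, 1)."""
    return Fraction(1, k + 1) if k % 2 == 0 else Fraction(0)

# Y = X^2, so E(Y) = E(X^2), E(Y^2) = E(X^4), E(XY) = E(X^3).
ex, ey = uniform_moment(1), uniform_moment(2)
exx, eyy, exy = uniform_moment(2), uniform_moment(4), uniform_moment(3)

a, b, e_min = linear_ms_estimator(ex, ey, exx, eyy, exy)
print(a, b, e_min)                         # 0 1/3 4/45

# Orthogonality principle, Eq. (7.21): E{[Y - (aX + b)]X} = E(XY) - a E(X^2) - b E(X).
orthogonality = exy - a * exx - b * ex
print(orthogonality)                       # 0
```

Here the best linear estimator is the constant $E(Y) = 1/3$ with m.s. error $\sigma_Y^2 = 4/45$, whereas the unrestricted m.s. estimator $E(Y \mid X) = X^2$ has zero error (Prob. 7.19). The gap shows how much can be lost by restricting $g(X)$ to be linear.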

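Next, we consider the estimator $\hat{Y}$ of $Y$ with a linear combination of the random sample
$(X_1, \ldots, X_n)$ by
$$\hat{Y} = \sum_{i=1}^{n} a_i X_i \tag{7.24}$$
Again, we maintain that in order to produce the linear estimator with the minimum m.s. error, the
coefficients $a_i$ must be such that the following orthogonality conditions are satisfied (Prob. 7.35):
$$E\left[\left(Y - \sum_{i=1}^{n} a_i X_i\right)X_j\right] = 0 \qquad j = 1, \ldots, n \tag{7.25}$$
Solving Eq. (7.25) for $a_i$, we obtain
$$\mathbf{a} = R^{-1}\mathbf{b} \tag{7.26}$$
where
$$\mathbf{a} = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \qquad \mathbf{b} = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \qquad b_j = E(YX_j) \qquad R = \begin{bmatrix} R_{11} & \cdots & R_{1n} \\ \vdots & \ddots & \vdots \\ R_{n1} & \cdots & R_{nn} \end{bmatrix} \qquad R_{ij} = E(X_i X_j)$$
and $R^{-1}$ is the inverse of $R$.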
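
In practice Eq. (7.26) is usually evaluated by solving the linear system $R\mathbf{a} = \mathbf{b}$ rather than by forming $R^{-1}$ explicitly. The sketch below illustrates this with Gauss-Jordan elimination in exact fractions. The elimination routine, its name, and the numerical values of $R$ and $\mathbf{b}$ are all assumptions made for illustration; they do not come from the text.

```python
from fractions import Fraction

def solve_linear_system(R, b):
    """Solve R a = b by Gauss-Jordan elimination with exact fractions.

    Raises StopIteration if R is singular (no nonzero pivot can be found).
    """
    n = len(b)
    # Augmented matrix [R | b].
    M = [[Fraction(R[i][j]) for j in range(n)] + [Fraction(b[i])] for i in range(n)]
    for col in range(n):
        # Swap a row with a nonzero entry in this column into the pivot position.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot equals 1.
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

# Hypothetical second moments: R_ij = E(X_i X_j), b_j = E(Y X_j).
R = [[2, 1],
     [1, 3]]
b = [1, 2]

a = solve_linear_system(R, b)
print(a)   # [Fraction(1, 5), Fraction(3, 5)]

# Orthogonality conditions (7.25): b_j - sum_i a_i R_ij = 0 for every j.
residuals = [b[j] - sum(a[i] * R[i][j] for i in range(2)) for j in range(2)]
print(residuals)
```

All residuals vanish, which confirms that the estimation error is orthogonal to each observation $X_j$.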
Solved Problems 


PROPERTIES OF POINT ESTIMATORS 


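7.1. Let $(X_1, \ldots, X_n)$ be a random sample of $X$ having unknown mean $\mu$. Show that the estimator of
$\mu$ defined by
$$M = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X} \tag{7.27}$$
is an unbiased estimator of $\mu$. Note that $\bar{X}$ is known as the sample mean (Prob. 4.64).

By Eq. (4.108),
$$E(M) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \frac{1}{n}(n\mu) = \mu$$
Thus, $M$ is an unbiased estimator of $\mu$.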


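7.2. Let $(X_1, \ldots, X_n)$ be a random sample of $X$ having unknown mean $\mu$ and variance $\sigma^2$. Show that
the estimator of $\sigma^2$ defined by
$$S^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 \tag{7.28}$$
where $\bar{X}$ is the sample mean, is a biased estimator of $\sigma^2$.

By definition, we have
$$\sigma^2 = E[(X_i - \mu)^2]$$
Now
$$E(S^2) = E\left[\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = E\left\{\frac{1}{n}\sum_{i=1}^{n} [(X_i - \mu) - (\bar{X} - \mu)]^2\right\}$$
$$= E\left\{\frac{1}{n}\sum_{i=1}^{n} [(X_i - \mu)^2 - 2(X_i - \mu)(\bar{X} - \mu) + (\bar{X} - \mu)^2]\right\}$$
$$= E\left[\frac{1}{n}\sum_{i=1}^{n} (X_i - \mu)^2 - (\bar{X} - \mu)^2\right]$$
$$= \frac{1}{n}\sum_{i=1}^{n} E[(X_i - \mu)^2] - E[(\bar{X} - \mu)^2] = \sigma^2 - \sigma_{\bar{X}}^2$$
where the third line uses $\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu) = \bar{X} - \mu$. By Eqs. (4.112) and (7.27), we have
$$\sigma_{\bar{X}}^2 = \text{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 = \frac{1}{n}\sigma^2$$
Thus
$$E(S^2) = \sigma^2 - \frac{1}{n}\sigma^2 = \frac{n - 1}{n}\sigma^2 \tag{7.29}$$
which shows that $S^2$ is a biased estimator of $\sigma^2$.


7.3. Let $(X_1, \ldots, X_n)$ be a random sample of a Poisson r.v. $X$ with unknown parameter $\lambda$.
(a) Show that
$$\Lambda_1 = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad \text{and} \qquad \Lambda_2 = \frac{1}{2}(X_1 + X_2)$$
are both unbiased estimators of $\lambda$.
(b) Which estimator is more efficient?

(a) By Eqs. (2.42) and (4.108), we have
$$E(\Lambda_1) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}(n\lambda) = \lambda$$
$$E(\Lambda_2) = \frac{1}{2}[E(X_1) + E(X_2)] = \frac{1}{2}(2\lambda) = \lambda$$
Thus, both estimators are unbiased estimators of $\lambda$.
(b) By Eqs. (2.43) and (4.112),
$$\text{Var}(\Lambda_1) = \text{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \text{Var}(X_i) = \frac{1}{n^2}(n\lambda) = \frac{\lambda}{n}$$
$$\text{Var}(\Lambda_2) = \frac{1}{4}(2\lambda) = \frac{\lambda}{2}$$
Thus, if $n > 2$, $\Lambda_1$ is a more efficient estimator of $\lambda$ than $\Lambda_2$, since $\lambda/n < \lambda/2$.


7.4. Let $(X_1, \ldots, X_n)$ be a random sample of $X$ with mean $\mu$ and variance $\sigma^2$. A linear estimator of $\mu$
is defined to be a linear function of $X_1, \ldots, X_n$, $l(X_1, \ldots, X_n)$. Show that the linear estimator
defined by [Eq. (7.27)],
$$M = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X}$$
is the most efficient linear unbiased estimator of $\mu$.

Assume that
$$M_1 = l(X_1, \ldots, X_n) = \sum_{i=1}^{n} a_i X_i$$
is a linear unbiased estimator of $\mu$ with lower variance than $M$. Since $M_1$ is unbiased, we must have
$$E(M_1) = \sum_{i=1}^{n} a_i E(X_i) = \mu\sum_{i=1}^{n} a_i = \mu$$
which implies that $\sum_{i=1}^{n} a_i = 1$. By Eq. (4.112),
$$\text{Var}(M) = \frac{1}{n}\sigma^2 \qquad \text{and} \qquad \text{Var}(M_1) = \sigma^2\sum_{i=1}^{n} a_i^2$$
By assumption,
$$\sigma^2\sum_{i=1}^{n} a_i^2 < \frac{1}{n}\sigma^2 \qquad \text{or} \qquad \sum_{i=1}^{n} a_i^2 < \frac{1}{n} \tag{7.30}$$
Consider the sum
$$0 \le \sum_{i=1}^{n}\left(a_i - \frac{1}{n}\right)^2 = \sum_{i=1}^{n}\left(a_i^2 - \frac{2a_i}{n} + \frac{1}{n^2}\right) = \sum_{i=1}^{n} a_i^2 - \frac{2}{n}\sum_{i=1}^{n} a_i + \frac{1}{n} = \sum_{i=1}^{n} a_i^2 - \frac{1}{n}$$
which, by assumption (7.30), is less than 0. This is impossible unless $a_i = 1/n$, implying that $M$ is the most
efficient linear unbiased estimator of $\mu$.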


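7.5. Show that if
$$\lim_{n \to \infty} E(\Theta_n) = \theta \qquad \text{and} \qquad \lim_{n \to \infty} \text{Var}(\Theta_n) = 0$$
then the estimator $\Theta_n$ is consistent.

Using Chebyshev's inequality (2.97), we can write
$$P(|\Theta_n - \theta| \ge \varepsilon) \le \frac{E[(\Theta_n - \theta)^2]}{\varepsilon^2} = \frac{1}{\varepsilon^2}E\{[\Theta_n - E(\Theta_n) + E(\Theta_n) - \theta]^2\}$$
$$= \frac{1}{\varepsilon^2}E\{[\Theta_n - E(\Theta_n)]^2 + [E(\Theta_n) - \theta]^2 + 2[\Theta_n - E(\Theta_n)][E(\Theta_n) - \theta]\}$$
$$= \frac{1}{\varepsilon^2}\left(\text{Var}(\Theta_n) + E\{[E(\Theta_n) - \theta]^2\} + 2E\{[\Theta_n - E(\Theta_n)][E(\Theta_n) - \theta]\}\right)$$
The last term is zero, since $E(\Theta_n) - \theta$ is a constant and $E[\Theta_n - E(\Theta_n)] = 0$. Thus, if
$$\lim_{n \to \infty} E(\Theta_n) = \theta \qquad \text{and} \qquad \lim_{n \to \infty} \text{Var}(\Theta_n) = 0$$
then
$$\lim_{n \to \infty} P(|\Theta_n - \theta| \ge \varepsilon) = 0$$
that is, $\Theta_n$ is consistent [see Eq. (7.6)].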


7.6. Let $(X_1, \ldots, X_n)$ be a random sample of a uniform r.v. $X$ over $(0, a)$, where $a$ is unknown. Show
that
$$A = \max(X_1, X_2, \ldots, X_n) \tag{7.31}$$
is a consistent estimator of the parameter $a$.

If $X$ is uniformly distributed over $(0, a)$, then from Eqs. (2.44), (2.45), and (4.98) of Prob. 4.30, the pdf of
$Z = \max(X_1, \ldots, X_n)$ is
$$f_Z(z) = n f_X(z)[F_X(z)]^{n-1} = \frac{n}{a}\left(\frac{z}{a}\right)^{n-1} \qquad 0 < z < a \tag{7.32}$$
Thus
$$E(A) = \int_0^a z f_Z(z)\,dz = \frac{n}{a^n}\int_0^a z^n\,dz = \frac{n}{n + 1}a$$
and
$$\lim_{n \to \infty} E(A) = a$$
Next,
$$E(A^2) = \int_0^a z^2 f_Z(z)\,dz = \frac{n}{a^n}\int_0^a z^{n+1}\,dz = \frac{n}{n + 2}a^2$$
$$\text{Var}(A) = E(A^2) - [E(A)]^2 = \frac{na^2}{n + 2} - \frac{n^2a^2}{(n + 1)^2} = \frac{n}{(n + 2)(n + 1)^2}a^2$$
and
$$\lim_{n \to \infty} \text{Var}(A) = 0$$
Thus, by Eqs. (7.7) and (7.8), $A$ is a consistent estimator of parameter $a$.
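
The two limits used above can be made concrete by tabulating the exact mean and variance of $A$ for increasing $n$. The sketch below is illustrative only; the function name and the choice $a = 1$ are assumptions. It evaluates $E(A) = na/(n + 1)$ and $\text{Var}(A) = E(A^2) - [E(A)]^2$ with exact fractions.

```python
from fractions import Fraction

def max_estimator_moments(n, a=1):
    """Exact mean and variance of A = max(X_1, ..., X_n), X_i uniform over (0, a)."""
    a = Fraction(a)
    mean = Fraction(n, n + 1) * a            # E(A) = n a / (n + 1)
    second = Fraction(n, n + 2) * a * a      # E(A^2) = n a^2 / (n + 2)
    return mean, second - mean * mean

for n in (1, 9, 99, 999):
    mean, var = max_estimator_moments(n)
    # The bias a - E(A) = a / (n + 1) and the variance both shrink toward 0.
    print(n, float(1 - mean), float(var))
```

For every finite $n$ the estimator is biased low by $a/(n + 1)$, so it is consistent without being unbiased.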


MAXIMUM-LIKELIHOOD ESTIMATION 


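7.7. Let $(X_1, \ldots, X_n)$ be a random sample of a binomial r.v. $X$ with parameters $(m, p)$, where $m$ is
assumed to be known and $p$ unknown. Determine the maximum-likelihood estimator of $p$.

The likelihood function is given by [Eq. (2.36)]
$$L(p) = f(x_1, \ldots, x_n; p) = \binom{m}{x_1}p^{x_1}(1 - p)^{m - x_1} \cdots \binom{m}{x_n}p^{x_n}(1 - p)^{m - x_n}$$
$$= \binom{m}{x_1} \cdots \binom{m}{x_n}\,p^{\sum_{i=1}^{n} x_i}(1 - p)^{mn - \sum_{i=1}^{n} x_i}$$
Taking the natural logarithm of the above expression, we get
$$\ln L(p) = \ln c + \left(\sum_{i=1}^{n} x_i\right)\ln p + \left(mn - \sum_{i=1}^{n} x_i\right)\ln(1 - p)$$
where
$$c = \prod_{i=1}^{n}\binom{m}{x_i}$$
and
$$\frac{d}{dp}\ln L(p) = \frac{1}{p}\sum_{i=1}^{n} x_i - \frac{1}{1 - p}\left(mn - \sum_{i=1}^{n} x_i\right)$$
Setting $d[\ln L(p)]/dp = 0$, the maximum-likelihood estimate $\hat{p}_{ML}$ of $p$ is obtained as
$$\frac{1}{\hat{p}_{ML}}\sum_{i=1}^{n} x_i = \frac{1}{1 - \hat{p}_{ML}}\left(mn - \sum_{i=1}^{n} x_i\right)$$
or
$$\hat{p}_{ML} = \frac{1}{mn}\sum_{i=1}^{n} x_i \tag{7.33}$$
Hence, the maximum-likelihood estimator of $p$ is given by
$$P_{ML} = \frac{1}{mn}\sum_{i=1}^{n} X_i = \frac{1}{m}\bar{X} \tag{7.34}$$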
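7.8. Let $(X_1, \ldots, X_n)$ be a random sample of a Poisson r.v. with unknown parameter $\lambda$. Determine the
maximum-likelihood estimator of $\lambda$.

The likelihood function is given by [Eq. (2.40)]
$$L(\lambda) = f(x_1, \ldots, x_n; \lambda) = \prod_{i=1}^{n} \frac{e^{-\lambda}\lambda^{x_i}}{x_i!} = \frac{e^{-n\lambda}\lambda^{\sum_{i=1}^{n} x_i}}{\prod_{i=1}^{n} x_i!}$$
Thus,
$$\ln L(\lambda) = -n\lambda + \ln\lambda\sum_{i=1}^{n} x_i - \ln c$$
where
$$c = \prod_{i=1}^{n} (x_i!)$$
and
$$\frac{d}{d\lambda}\ln L(\lambda) = -n + \frac{1}{\lambda}\sum_{i=1}^{n} x_i$$
Setting $d[\ln L(\lambda)]/d\lambda = 0$, the maximum-likelihood estimate $\hat{\lambda}_{ML}$ of $\lambda$ is obtained as
$$\hat{\lambda}_{ML} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{7.35}$$
Hence, the maximum-likelihood estimator of $\lambda$ is given by
$$\Lambda_{ML} = s(X_1, \ldots, X_n) = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X} \tag{7.36}$$


7.9. Let $(X_1, \ldots, X_n)$ be a random sample of an exponential r.v. $X$ with unknown parameter $\lambda$.
Determine the maximum-likelihood estimator of $\lambda$.

The likelihood function is given by [Eq. (2.48)]
$$L(\lambda) = f(x_1, \ldots, x_n; \lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda x_i} = \lambda^n e^{-\lambda\sum_{i=1}^{n} x_i}$$
Thus,
$$\ln L(\lambda) = n\ln\lambda - \lambda\sum_{i=1}^{n} x_i$$
and
$$\frac{d}{d\lambda}\ln L(\lambda) = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i$$
Setting $d[\ln L(\lambda)]/d\lambda = 0$, the maximum-likelihood estimate $\hat{\lambda}_{ML}$ of $\lambda$ is obtained as
$$\hat{\lambda}_{ML} = \frac{n}{\sum_{i=1}^{n} x_i} \tag{7.37}$$
Hence, the maximum-likelihood estimator of $\lambda$ is given by
$$\Lambda_{ML} = s(X_1, \ldots, X_n) = \frac{n}{\sum_{i=1}^{n} X_i} = \frac{1}{\bar{X}} \tag{7.38}$$


7.10. Let $(X_1, \ldots, X_n)$ be a random sample of a normal r.v. $X$ with unknown mean $\mu$ and
unknown variance $\sigma^2$. Determine the maximum-likelihood estimators of $\mu$ and $\sigma^2$.

The likelihood function is given by [Eq. (2.52)]
$$L(\mu, \sigma) = f(x_1, \ldots, x_n; \mu, \sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\sigma}\exp\left[-\frac{1}{2\sigma^2}(x_i - \mu)^2\right]$$
$$= \left(\frac{1}{2\pi}\right)^{n/2}\frac{1}{\sigma^n}\exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2\right]$$
Thus,
$$\ln L(\mu, \sigma) = -\frac{n}{2}\ln(2\pi) - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$
In order to find the values of $\mu$ and $\sigma$ maximizing the above, we compute
$$\frac{\partial}{\partial\mu}\ln L(\mu, \sigma) = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu)$$
$$\frac{\partial}{\partial\sigma}\ln L(\mu, \sigma) = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{n}(x_i - \mu)^2$$
Equating these equations to zero, we get
$$\sum_{i=1}^{n}(x_i - \hat{\mu}_{ML}) = 0$$
and
$$\frac{1}{\hat{\sigma}_{ML}^3}\sum_{i=1}^{n}(x_i - \hat{\mu}_{ML})^2 = \frac{n}{\hat{\sigma}_{ML}}$$
Solving for $\hat{\mu}_{ML}$ and $\hat{\sigma}_{ML}$, the maximum-likelihood estimates of $\mu$ and $\sigma^2$ are given, respectively, by
$$\hat{\mu}_{ML} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{7.39}$$
$$\hat{\sigma}_{ML}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu}_{ML})^2 \tag{7.40}$$
Hence, the maximum-likelihood estimators of $\mu$ and $\sigma^2$ are given, respectively, by
$$M_{ML} = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X} \tag{7.41}$$
$$S_{ML}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2 \tag{7.42}$$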

BAYES’ ESTIMATION 
7.11. Let $(X_1, \ldots, X_n)$ be the random sample of a Bernoulli r.v. $X$ with pmf given by [Eq. (2.32)]
$$f(x; p) = p^x(1 - p)^{1-x} \qquad x = 0, 1 \tag{7.43}$$
where $p$, $0 \le p \le 1$, is unknown. Assume that $p$ is a uniform r.v. over (0, 1). Find the Bayes'
estimator of $p$.

The prior pdf of $p$ is the uniform pdf; that is,
$$f(p) = 1 \qquad 0 < p < 1$$
The posterior pdf of $p$ is given by
$$f(p \mid x_1, \ldots, x_n) = \frac{f(x_1, \ldots, x_n, p)}{f(x_1, \ldots, x_n)}$$
Then, by Eq. (7.12),
$$f(x_1, \ldots, x_n, p) = f(x_1, \ldots, x_n \mid p)f(p) = p^{\sum_{i=1}^{n} x_i}(1 - p)^{n - \sum_{i=1}^{n} x_i} = p^m(1 - p)^{n-m}$$
where $m = \sum_{i=1}^{n} x_i$, and by Eq. (7.13),
$$f(x_1, \ldots, x_n) = \int_0^1 f(x_1, \ldots, x_n, p)\,dp = \int_0^1 p^m(1 - p)^{n-m}\,dp$$
Now, from calculus, for integers $m$ and $k$, we have
$$\int_0^1 p^m(1 - p)^k\,dp = \frac{m!\,k!}{(m + k + 1)!} \tag{7.44}$$
Thus, by Eq. (7.14), the posterior pdf of $p$ is
$$f(p \mid x_1, \ldots, x_n) = \frac{f(x_1, \ldots, x_n, p)}{f(x_1, \ldots, x_n)} = \frac{(n + 1)!\,p^m(1 - p)^{n-m}}{m!\,(n - m)!}$$
and by Eqs. (7.15) and (7.44),
$$E(p \mid x_1, \ldots, x_n) = \int_0^1 p f(p \mid x_1, \ldots, x_n)\,dp = \frac{(n + 1)!}{m!\,(n - m)!}\int_0^1 p^{m+1}(1 - p)^{n-m}\,dp$$
$$= \frac{(n + 1)!}{m!\,(n - m)!}\,\frac{(m + 1)!\,(n - m)!}{(n + 2)!} = \frac{m + 1}{n + 2} = \frac{1}{n + 2}\left(\sum_{i=1}^{n} x_i + 1\right)$$
Hence, by Eq. (7.16), the Bayes' estimator of $p$ is
$$P_B = E(p \mid X_1, \ldots, X_n) = \frac{1}{n + 2}\left(\sum_{i=1}^{n} X_i + 1\right) \tag{7.45}$$


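7.12. Let $(X_1, \ldots, X_n)$ be a random sample of an exponential r.v. $X$ with unknown parameter $\lambda$.
Assume that $\lambda$ is itself an exponential r.v. with parameter $\alpha$. Find the Bayes' estimator of $\lambda$.

The assumed prior pdf of $\lambda$ is [Eq. (2.48)]
$$f(\lambda) = \begin{cases} \alpha e^{-\alpha\lambda} & \lambda > 0 \\ 0 & \text{otherwise} \end{cases}$$
Now
$$f(x_1, \ldots, x_n \mid \lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda x_i} = \lambda^n e^{-\lambda\sum_{i=1}^{n} x_i} = \lambda^n e^{-m\lambda}$$
where $m = \sum_{i=1}^{n} x_i$. Then, by Eqs. (7.12) and (7.13),
$$f(x_1, \ldots, x_n) = \int_0^\infty f(x_1, \ldots, x_n \mid \lambda)f(\lambda)\,d\lambda = \int_0^\infty \lambda^n e^{-m\lambda}\alpha e^{-\alpha\lambda}\,d\lambda$$
$$= \alpha\int_0^\infty \lambda^n e^{-(\alpha + m)\lambda}\,d\lambda = \alpha\frac{n!}{(\alpha + m)^{n+1}}$$
By Eq. (7.14), the posterior pdf of $\lambda$ is given by
$$f(\lambda \mid x_1, \ldots, x_n) = \frac{f(x_1, \ldots, x_n \mid \lambda)f(\lambda)}{f(x_1, \ldots, x_n)} = \frac{(\alpha + m)^{n+1}\lambda^n e^{-(\alpha + m)\lambda}}{n!}$$
Thus, by Eq. (7.15), the Bayes' estimate of $\lambda$ is
$$\lambda_B = E(\lambda \mid x_1, \ldots, x_n) = \int_0^\infty \lambda f(\lambda \mid x_1, \ldots, x_n)\,d\lambda = \frac{(\alpha + m)^{n+1}}{n!}\int_0^\infty \lambda^{n+1}e^{-(\alpha + m)\lambda}\,d\lambda$$
$$= \frac{(\alpha + m)^{n+1}}{n!}\,\frac{(n + 1)!}{(\alpha + m)^{n+2}} = \frac{n + 1}{\alpha + m} = \frac{n + 1}{\alpha + \sum_{i=1}^{n} x_i} \tag{7.46}$$
and the Bayes' estimator of $\lambda$ is
$$\Lambda_B = \frac{n + 1}{\alpha + \sum_{i=1}^{n} X_i} = \frac{n + 1}{\alpha + n\bar{X}} \tag{7.47}$$


7.13. Let $(X_1, \ldots, X_n)$ be a random sample of a normal r.v. $X$ with unknown mean $\mu$ and variance 1.
Assume that $\mu$ is itself a normal r.v. with mean 0 and variance 1. Find the Bayes' estimator
of $\mu$.

The assumed prior pdf of $\mu$ is
$$f(\mu) = \frac{1}{\sqrt{2\pi}}e^{-\mu^2/2}$$
Then by Eq. (7.12),
$$f(x_1, \ldots, x_n, \mu) = f(x_1, \ldots, x_n \mid \mu)f(\mu) = \frac{1}{(2\pi)^{n/2}}\exp\left[-\sum_{i=1}^{n}\frac{(x_i - \mu)^2}{2}\right]\frac{1}{\sqrt{2\pi}}e^{-\mu^2/2}$$
$$= \frac{\exp\left(-\sum_{i=1}^{n}\frac{x_i^2}{2}\right)}{(2\pi)^{(n+1)/2}}\exp\left[-\frac{n + 1}{2}\left(\mu^2 - \frac{2\mu}{n + 1}\sum_{i=1}^{n} x_i\right)\right]$$
$$= \frac{\exp\left(-\sum_{i=1}^{n}\frac{x_i^2}{2}\right)\exp\left[\frac{1}{2(n + 1)}\left(\sum_{i=1}^{n} x_i\right)^2\right]}{(2\pi)^{(n+1)/2}}\exp\left[-\frac{n + 1}{2}\left(\mu - \frac{1}{n + 1}\sum_{i=1}^{n} x_i\right)^2\right]$$
Then, by Eq. (7.14), the posterior pdf of $\mu$ is given by
$$f(\mu \mid x_1, \ldots, x_n) = \frac{f(x_1, \ldots, x_n, \mu)}{\int_{-\infty}^{\infty} f(x_1, \ldots, x_n, \mu)\,d\mu} = C\exp\left[-\frac{n + 1}{2}\left(\mu - \frac{1}{n + 1}\sum_{i=1}^{n} x_i\right)^2\right] \tag{7.48}$$
where $C = C(x_1, \ldots, x_n)$ is independent of $\mu$. However, Eq. (7.48) is just the pdf of a normal r.v. with mean
$$\frac{1}{n + 1}\left(\sum_{i=1}^{n} x_i\right)$$
and variance
$$\frac{1}{n + 1}$$
Hence, the conditional distribution of $\mu$ given $x_1, \ldots, x_n$ is the normal distribution with this mean
and variance. Thus, the Bayes' estimate of $\mu$ is given by
$$\mu_B = E(\mu \mid x_1, \ldots, x_n) = \frac{1}{n + 1}\sum_{i=1}^{n} x_i \tag{7.49}$$
and the Bayes' estimator of $\mu$ is
$$M_B = \frac{1}{n + 1}\sum_{i=1}^{n} X_i = \frac{n}{n + 1}\bar{X} \tag{7.50}$$


7.14. Let $(X_1, \ldots, X_n)$ be a random sample of a r.v. $X$ with pdf $f(x; \theta)$, where $\theta$ is an unknown
parameter. The statistics $L$ and $U$ determine a $100(1 - \alpha)$ percent confidence interval $(L, U)$ for
the parameter $\theta$ if
$$P(L < \theta < U) \ge 1 - \alpha \qquad 0 < \alpha < 1 \tag{7.51}$$
and $1 - \alpha$ is called the confidence coefficient. Find $L$ and $U$ if $X$ is a normal r.v. with known
variance $\sigma^2$ and mean $\mu$ is an unknown parameter.

If $X = N(\mu; \sigma^2)$, then
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \qquad \text{where } \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
is a standard normal r.v., and hence for a given $\alpha$ we can find a number $z_{\alpha/2}$ from Table A (Appendix A)
such that
$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha \tag{7.52}$$
For example, if $1 - \alpha = 0.95$, then $z_{\alpha/2} = z_{0.025} = 1.96$, and if $1 - \alpha = 0.9$, then $z_{\alpha/2} = z_{0.05} = 1.645$. Now,
recalling that $\sigma > 0$, we have the following equivalent inequality relationships:
$$-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}$$
$$-z_{\alpha/2}(\sigma/\sqrt{n}) < \bar{X} - \mu < z_{\alpha/2}(\sigma/\sqrt{n})$$
$$-\bar{X} - z_{\alpha/2}(\sigma/\sqrt{n}) < -\mu < -\bar{X} + z_{\alpha/2}(\sigma/\sqrt{n})$$
and
$$\bar{X} + z_{\alpha/2}(\sigma/\sqrt{n}) > \mu > \bar{X} - z_{\alpha/2}(\sigma/\sqrt{n})$$
Thus, we have
$$P[\bar{X} - z_{\alpha/2}(\sigma/\sqrt{n}) < \mu < \bar{X} + z_{\alpha/2}(\sigma/\sqrt{n})] = 1 - \alpha \tag{7.53}$$
and so
$$L = \bar{X} - z_{\alpha/2}(\sigma/\sqrt{n}) \qquad \text{and} \qquad U = \bar{X} + z_{\alpha/2}(\sigma/\sqrt{n}) \tag{7.54}$$


7.15. Consider a normal r.v. with variance 1.66 and unknown mean $\mu$. Find the 95 percent confidence
interval for the mean based on a random sample of size 10.

As shown in Prob. 7.14, for $1 - \alpha = 0.95$, we have $z_{\alpha/2} = z_{0.025} = 1.96$ and
$$z_{\alpha/2}(\sigma/\sqrt{n}) = 1.96(\sqrt{1.66}/\sqrt{10}) = 0.8$$
Thus, by Eq. (7.54), the 95 percent confidence interval for $\mu$ is
$$(\bar{X} - 0.8, \bar{X} + 0.8)$$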
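
The table lookup and the arithmetic above can be reproduced with Python's standard library. The sketch below is illustrative: the helper names are assumptions, and `statistics.NormalDist().inv_cdf` is used in place of Table A. It computes $z_{\alpha/2}$, the half-width $z_{\alpha/2}\sigma/\sqrt{n}$ of Eq. (7.54), and the smallest sample size for which the full width $2z_{\alpha/2}\sigma/\sqrt{n}$ does not exceed a target.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_value(confidence):
    """z_{alpha/2} such that P(-z < Z < z) = confidence for a standard normal Z."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def half_width(variance, n, confidence=0.95):
    """Half-width z_{alpha/2} * sigma / sqrt(n) of the interval in Eq. (7.54)."""
    return z_value(confidence) * sqrt(variance / n)

def sample_size_for_width(variance, width, confidence=0.95):
    """Smallest n for which the width 2 z sigma / sqrt(n) does not exceed `width`."""
    z = z_value(confidence)
    return ceil((2 * z * sqrt(variance) / width) ** 2)

print(z_value(0.95), z_value(0.90))
print(half_width(1.66, 10))              # data of the problem above
print(sample_size_for_width(100, 5))     # variance 100, target width 5
```

The half-width for variance 1.66 and n = 10 rounds to 0.8, as found above. The last call uses the numbers of the supplementary problem on sample size (variance 100, width 5) and returns 62, the answer tabulated there.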


MEAN SQUARE ESTIMATION 


7.16. Find the m.s. estimate of a r.v. $Y$ by a constant $c$.

By Eq. (7.17), the m.s. error is
$$e = E[(Y - c)^2] = \int_{-\infty}^{\infty} (y - c)^2 f(y)\,dy \tag{7.55}$$
Clearly the m.s. error $e$ depends on $c$, and it is minimum if
$$\frac{de}{dc} = -2\int_{-\infty}^{\infty} (y - c)f(y)\,dy = 0$$
or
$$c\int_{-\infty}^{\infty} f(y)\,dy = c = \int_{-\infty}^{\infty} yf(y)\,dy$$
Thus, we conclude that the m.s. estimate $c$ of $Y$ is given by
$$\hat{y} = c = \int_{-\infty}^{\infty} yf(y)\,dy = E(Y) \tag{7.56}$$


7.17. Find the m.s. estimator of a r.v. $Y$ by a function $g(X)$ of the r.v. $X$.

By Eq. (7.17), the m.s. error is
$$e = E\{[Y - g(X)]^2\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [y - g(x)]^2 f(x, y)\,dx\,dy$$
Since $f(x, y) = f(y \mid x)f(x)$, we can write
$$e = \int_{-\infty}^{\infty} f(x)\left\{\int_{-\infty}^{\infty} [y - g(x)]^2 f(y \mid x)\,dy\right\}dx \tag{7.57}$$
Since the integrands above are positive, the m.s. error $e$ is minimum if the inner integral,
$$\int_{-\infty}^{\infty} [y - g(x)]^2 f(y \mid x)\,dy \tag{7.58}$$
is minimum for every $x$. Comparing Eq. (7.58) with Eq. (7.55) (Prob. 7.16), we see that they are the same
form if $c$ is changed to $g(x)$ and $f(y)$ is changed to $f(y \mid x)$. Thus, by the result of Prob. 7.16 [Eq. (7.56)], we
conclude that the m.s. estimate of $Y$ is given by
$$\hat{y} = g(x) = \int_{-\infty}^{\infty} yf(y \mid x)\,dy = E(Y \mid x) \tag{7.59}$$
Hence, the m.s. estimator of $Y$ is
$$\hat{Y} = g(X) = E(Y \mid X) \tag{7.60}$$


7.18. Find the m.s. error if $g(x) = E(Y \mid x)$ is the m.s. estimate of $Y$.

As we see from Eq. (3.58), the conditional mean $E(Y \mid x)$ of $Y$, given that $X = x$, is a function of $x$, and
by Eq. (4.39),
$$E[E(Y \mid X)] = E(Y) \tag{7.61}$$
Similarly, the conditional mean $E[g(X, Y) \mid x]$ of $g(X, Y)$, given that $X = x$, is a function of $x$. It defines,
therefore, the function $E[g(X, Y) \mid X]$ of the r.v. $X$. Then
$$E\{E[g(X, Y) \mid X]\} = \int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} g(x, y)f(y \mid x)\,dy\right]f(x)\,dx$$
$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y)f(y \mid x)f(x)\,dx\,dy$$
$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y)f(x, y)\,dx\,dy = E[g(X, Y)] \tag{7.62}$$
Note that Eq. (7.62) is the generalization of Eq. (7.61). Next, we note that
$$E[g_1(X)g_2(Y) \mid x] = E[g_1(x)g_2(Y) \mid x] = g_1(x)E[g_2(Y) \mid x] \tag{7.63}$$
Then by Eqs. (7.62) and (7.63), we have
$$E[g_1(X)g_2(Y)] = E\{E[g_1(X)g_2(Y) \mid X]\} = E\{g_1(X)E[g_2(Y) \mid X]\} \tag{7.64}$$
Now, setting $g_1(X) = g(X)$ and $g_2(Y) = Y$ in Eq. (7.64), and using Eq. (7.18), we obtain
$$E[g(X)Y] = E[g(X)E(Y \mid X)] = E[g^2(X)]$$
Thus, the m.s. error is given by
$$e = E\{[Y - g(X)]^2\} = E(Y^2) - 2E[g(X)Y] + E[g^2(X)] = E(Y^2) - E[g^2(X)] \tag{7.65}$$


7.19. Let $Y = X^2$ and $X$ be a uniform r.v. over $(-1, 1)$. Find the m.s. estimator of $Y$ in terms of $X$ and
its m.s. error.

By Eq. (7.18), the m.s. estimate of $Y$ is given by
$$g(x) = E(Y \mid x) = E(X^2 \mid X = x) = x^2$$
Hence, the m.s. estimator of $Y$ is
$$\hat{Y} = X^2 \tag{7.66}$$
The m.s. error is
$$e = E\{[Y - g(X)]^2\} = E\{[X^2 - X^2]^2\} = 0 \tag{7.67}$$


LINEAR MEAN SQUARE ESTIMATION 
7.20. Derive the orthogonality principle (7.21) and Eq. (7.22).

By Eq. (7.20), the m.s. error is
$$e(a, b) = E\{[Y - (aX + b)]^2\}$$
Clearly, the m.s. error $e$ is a function of $a$ and $b$, and it is minimum if $\partial e/\partial a = 0$ and $\partial e/\partial b = 0$. Now
$$\frac{\partial e}{\partial a} = E\{2[Y - (aX + b)](-X)\} = -2E\{[Y - (aX + b)]X\}$$
$$\frac{\partial e}{\partial b} = E\{2[Y - (aX + b)](-1)\} = -2E\{[Y - (aX + b)]\}$$
Setting $\partial e/\partial a = 0$ and $\partial e/\partial b = 0$, we obtain
$$E\{[Y - (aX + b)]X\} = 0 \tag{7.68}$$
$$E[Y - (aX + b)] = 0 \tag{7.69}$$
Note that Eq. (7.68) is the orthogonality principle (7.21).
Rearranging Eqs. (7.68) and (7.69), we get
$$E(X^2)a + E(X)b = E(XY)$$
$$E(X)a + b = E(Y)$$
Solving for $a$ and $b$, we obtain Eq. (7.22); that is,
$$a = \frac{E(XY) - E(X)E(Y)}{E(X^2) - [E(X)]^2} = \frac{\sigma_{XY}}{\sigma_X^2} = \frac{\sigma_Y}{\sigma_X}\rho_{XY}$$
$$b = E(Y) - aE(X) = \mu_Y - a\mu_X$$
where we have used Eqs. (2.31), (3.51), and (3.53).


7.21. Show that m.s. error defined by Eq. (7.20) is minimum when Eqs. (7.68) and (7.69) are satisfied.

Assume that $\hat{Y} = cX + d$, where $c$ and $d$ are arbitrary constants. Then
$$e(c, d) = E\{[Y - (cX + d)]^2\} = E\{[Y - (aX + b) + (a - c)X + (b - d)]^2\}$$
$$= E\{[Y - (aX + b)]^2\} + E\{[(a - c)X + (b - d)]^2\}$$
$$\quad + 2(a - c)E\{[Y - (aX + b)]X\} + 2(b - d)E\{[Y - (aX + b)]\}$$
$$= e(a, b) + E\{[(a - c)X + (b - d)]^2\}$$
$$\quad + 2(a - c)E\{[Y - (aX + b)]X\} + 2(b - d)E\{[Y - (aX + b)]\}$$
The last two terms on the right-hand side are zero when Eqs. (7.68) and (7.69) are satisfied, and the second
term on the right-hand side is nonnegative for any $c$ and $d$. Thus, $e(c, d) \ge e(a, b)$ for any $c$ and $d$. Hence,
$e(a, b)$ is minimum.


7.22. Derive Eq. (7.23).

By Eqs. (7.68) and (7.69), we have
$$E\{[Y - (aX + b)]aX\} = 0 = E\{[Y - (aX + b)]b\}$$
Then
$$e_m = e(a, b) = E\{[Y - (aX + b)]^2\} = E\{[Y - (aX + b)][Y - (aX + b)]\}$$
$$= E\{[Y - (aX + b)]Y\} = E(Y^2) - aE(XY) - bE(Y)$$
Using Eqs. (2.31), (3.51), and (3.53), and substituting the values of $a$ and $b$ [Eq. (7.22)] in the above expression,
the minimum m.s. error is
$$e_m = \sigma_Y^2 + \mu_Y^2 - a(\sigma_{XY} + \mu_X\mu_Y) - (\mu_Y - a\mu_X)\mu_Y$$
$$= \sigma_Y^2 - a\sigma_{XY} = \sigma_Y^2 - \frac{\sigma_{XY}^2}{\sigma_X^2} = \sigma_Y^2\left(1 - \frac{\sigma_{XY}^2}{\sigma_X^2\sigma_Y^2}\right) = \sigma_Y^2(1 - \rho_{XY}^2)$$
which is Eq. (7.23).


7.23. Let $Y = X^2$, and let $X$ be a uniform r.v. over $(-1, 1)$ (see Prob. 7.19). Find the linear m.s.
estimator of $Y$ in terms of $X$ and its m.s. error.

The linear m.s. estimator of $Y$ in terms of $X$ is
$$\hat{Y} = aX + b$$
where $a$ and $b$ are given by [Eq. (7.22)]
$$a = \frac{\sigma_{XY}}{\sigma_X^2} \qquad b = \mu_Y - a\mu_X$$
Now, by Eqs. (2.46) and (2.44),
$$\mu_X = E(X) = 0$$
$$E(XY) = E(XX^2) = E(X^3) = \frac{1}{2}\int_{-1}^{1} x^3\,dx = 0$$
By Eq. (3.51),
$$\sigma_{XY} = \text{Cov}(X, Y) = E(XY) - E(X)E(Y) = 0$$
Thus, $a = 0$ and $b = E(Y)$, and the linear m.s. estimator of $Y$ is
$$\hat{Y} = b = E(Y) \tag{7.70}$$
and the m.s. error is
$$e = E\{[Y - E(Y)]^2\} = \sigma_Y^2 \tag{7.71}$$


7.24. Find the minimum m.s. error estimator of $Y$ in terms of $X$ when $X$ and $Y$ are jointly normal
r.v.'s.

By Eq. (7.18), the minimum m.s. error estimator of $Y$ in terms of $X$ is
$$\hat{Y} = E(Y \mid X)$$
Now, when $X$ and $Y$ are jointly normal, by Eq. (3.108) (Prob. 3.51), we have
$$E(Y \mid x) = \rho_{XY}\frac{\sigma_Y}{\sigma_X}x + \mu_Y - \rho_{XY}\frac{\sigma_Y}{\sigma_X}\mu_X$$
Hence, the minimum m.s. error estimator of $Y$ is
$$\hat{Y} = E(Y \mid X) = \rho_{XY}\frac{\sigma_Y}{\sigma_X}X + \mu_Y - \rho_{XY}\frac{\sigma_Y}{\sigma_X}\mu_X \tag{7.72}$$
Comparing Eq. (7.72) with Eqs. (7.19) and (7.22), we see that for jointly normal r.v.'s the linear m.s. estimator
is the minimum m.s. error estimator.


Supplementary Problems 


7.25. Let $(X_1, \ldots, X_n)$ be a random sample of $X$ having unknown mean $\mu$ and variance $\sigma^2$. Show that the
estimator of $\sigma^2$ defined by
$$S_1^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(X_i - \bar{X})^2$$
where $\bar{X}$ is the sample mean, is an unbiased estimator of $\sigma^2$. Note that $S_1^2$ is often called the sample
variance.
Hint: Show that $S_1^2 = \dfrac{n}{n - 1}S^2$, and use Eq. (7.29).

7.26. Let $(X_1, \ldots, X_n)$ be a random sample of $X$ having known mean $\mu$ and unknown variance $\sigma^2$. Show that the
estimator of $\sigma^2$ defined by
$$S_1^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2$$
is an unbiased estimator of $\sigma^2$.
Hint: Proceed as in Prob. 7.2.

7.27. Let $(X_1, \ldots, X_n)$ be a random sample of a binomial r.v. $X$ with parameter $(m, p)$, where $p$ is unknown. Show
that the maximum-likelihood estimator of $p$ given by Eq. (7.34) is unbiased.
Hint: Use Eq. (2.38).

7.28. Let $(X_1, \ldots, X_n)$ be a random sample of a Bernoulli r.v. $X$ with pmf $f(x; p) = p^x(1 - p)^{1-x}$, $x = 0, 1$, where
$p$, $0 \le p \le 1$, is unknown. Find the maximum-likelihood estimator of $p$.
Ans. $P_{ML} = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n} X_i = \bar{X}$

7.29. The values of a random sample, 2.9, 0.5, 1.7, 4.3, and 3.2, are obtained from a r.v. $X$ that is uniformly
distributed over the unknown interval $(a, b)$. Find the maximum-likelihood estimates of $a$ and $b$.
Ans. $\hat{a}_{ML} = \min_i x_i = 0.5$, $\hat{b}_{ML} = \max_i x_i = 4.3$

7.30. In analyzing the flow of traffic through a drive-in bank, the times (in minutes) between arrivals of 10
customers are recorded as 3.2, 2.1, 5.3, 4.2, 1.2, 2.8, 6.4, 1.5, 1.9, and 3.0. Assuming that the interarrival time
is an exponential r.v. with parameter $\lambda$, find the maximum-likelihood estimate of $\lambda$.
Ans. $\hat{\lambda}_{ML} = \dfrac{1}{3.16}$

7.31. Let $(X_1, \ldots, X_n)$ be a random sample of a normal r.v. $X$ with known mean $\mu$ and unknown variance $\sigma^2$.
Find the maximum-likelihood estimator of $\sigma^2$.
Ans. $S_{ML}^2 = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n}(X_i - \mu)^2$

7.32. Let $(X_1, \ldots, X_n)$ be the random sample of a normal r.v. $X$ with mean $\mu$ and variance $\sigma^2$, where $\mu$ is
unknown. Assume that $\mu$ is itself a normal r.v. with mean $\mu_1$ and variance $\sigma_1^2$. Find the Bayes'
estimate of $\mu$.
Ans. $\mu_B = \dfrac{\dfrac{\mu_1}{\sigma_1^2} + \dfrac{n\bar{x}}{\sigma^2}}{\dfrac{1}{\sigma_1^2} + \dfrac{n}{\sigma^2}} \qquad \bar{x} = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n} x_i$

7.33. Let $(X_1, \ldots, X_n)$ be the random sample of a normal r.v. $X$ with variance 100 and unknown $\mu$. What sample
size $n$ is required such that the width of the 95 percent confidence interval is 5?
Ans. $n = 62$

7.34. Find a constant $a$ such that if $Y$ is estimated by $aX$, the m.s. error is minimum, and also find the minimum
m.s. error $e_m$.
Ans. $a = E(XY)/E(X^2) \qquad e_m = E(Y^2) - [E(XY)]^2/E(X^2)$

7.35. Derive Eqs. (7.25) and (7.26).
Hint: Proceed as in Prob. 7.20.


Chapter 8 


Decision Theory 


8.1 INTRODUCTION


There are many situations in which we have to make decisions based on observations or data 
that are random variables. The theory behind the solutions for these situations is known as decision 
theory or hypothesis testing. In communication or radar technology, decision theory or hypothesis 
testing is known as (signal) detection theory. In this chapter we present a brief review of the binary 
decision theory and various decision tests. 


8.2 HYPOTHESIS TESTING 
A. Definitions: 


A statistical hypothesis is an assumption about the probability law of r.v.'s. Suppose we observe a random sample \((X_1, \ldots, X_n)\) of a r.v. X whose pdf \(f(\mathbf{x}; \theta) = f(x_1, \ldots, x_n; \theta)\) depends on a parameter \(\theta\). We wish to test the assumption \(\theta = \theta_0\) against the assumption \(\theta = \theta_1\). The assumption \(\theta = \theta_0\) is denoted by \(H_0\) and is called the null hypothesis. The assumption \(\theta = \theta_1\) is denoted by \(H_1\) and is called the alternative hypothesis.

\(H_0\): \(\theta = \theta_0\) (Null hypothesis)
\(H_1\): \(\theta = \theta_1\) (Alternative hypothesis)

A hypothesis is called simple if all parameters are specified exactly. Otherwise it is called composite. Thus, suppose \(H_0\): \(\theta = \theta_0\) and \(H_1\): \(\theta \ne \theta_0\); then \(H_0\) is simple and \(H_1\) is composite.


B. Hypothesis Testing and Types of Errors: 


Hypothesis testing is a decision process establishing the validity of a hypothesis. We can think of the decision process as dividing the observation space \(R^n\) (Euclidean n-space) into two regions \(R_0\) and \(R_1\). Let \(\mathbf{x} = (x_1, \ldots, x_n)\) be the observed vector. Then if \(\mathbf{x} \in R_0\), we decide on \(H_0\); if \(\mathbf{x} \in R_1\), we decide on \(H_1\). The region \(R_0\) is known as the acceptance region and the region \(R_1\) as the rejection (or critical) region (since the null hypothesis is rejected). Thus, with the observation vector (or data), one of the following four actions can happen:

1. \(H_0\) true; accept \(H_0\)
2. \(H_0\) true; reject \(H_0\) (or accept \(H_1\))
3. \(H_1\) true; accept \(H_1\)
4. \(H_1\) true; reject \(H_1\) (or accept \(H_0\))

The first and third actions correspond to correct decisions, and the second and fourth actions correspond to errors. The errors are classified as

1. Type I error: Reject \(H_0\) (or accept \(H_1\)) when \(H_0\) is true.
2. Type II error: Reject \(H_1\) (or accept \(H_0\)) when \(H_1\) is true.

Let \(P_I\) and \(P_{II}\) denote, respectively, the probabilities of Type I and Type II errors:

\[ P_I = P(D_1 \mid H_0) = P(\mathbf{x} \in R_1; H_0) \tag{8.1} \]
\[ P_{II} = P(D_0 \mid H_1) = P(\mathbf{x} \in R_0; H_1) \tag{8.2} \]

where \(D_i\) (i = 0, 1) denotes the event that the decision is made to accept \(H_i\). \(P_I\) is often denoted by \(\alpha\) and is known as the level of significance, and \(P_{II}\) is denoted by \(\beta\) and \((1 - \beta)\) is known as the power of the test. Note that since \(\alpha\) and \(\beta\) represent probabilities of events from the same decision problem, they are not independent of each other or of the sample size n. It would be desirable to have a decision process such that both \(\alpha\) and \(\beta\) will be small. However, in general, a decrease in one type of error leads to an increase in the other type for a fixed sample size (Prob. 8.4). The only way to simultaneously reduce both types of errors is to increase the sample size (Prob. 8.5). One might also attach some relative importance (or cost) to the four possible courses of action and minimize the total cost of the decision (see Sec. 8.3D).

The probabilities of correct decisions (actions 1 and 3) may be expressed as

\[ P(D_0 \mid H_0) = P(\mathbf{x} \in R_0; H_0) \tag{8.3} \]
\[ P(D_1 \mid H_1) = P(\mathbf{x} \in R_1; H_1) \tag{8.4} \]

In radar signal detection, the two hypotheses are

\(H_0\): No target exists
\(H_1\): Target is present

In this case, the probability of a Type I error \(P_I = P(D_1 \mid H_0)\) is often referred to as the false-alarm probability (denoted by \(P_F\)), the probability of a Type II error \(P_{II} = P(D_0 \mid H_1)\) as the miss probability (denoted by \(P_M\)), and \(P(D_1 \mid H_1)\) as the detection probability (denoted by \(P_D\)). The cost of failing to detect a target cannot be easily determined. In general we set a value of \(P_F\) which is acceptable and seek a decision test that constrains \(P_F\) to this value while maximizing \(P_D\) (or equivalently minimizing \(P_M\)). This test is known as the Neyman-Pearson test (see Sec. 8.3C).


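8.3 DECISION TESTS
A. Maximum-Likelihood Test:


Let \(\mathbf{x}\) be the observation vector and \(P(\mathbf{x} \mid H_i)\), i = 0, 1, denote the probability of observing \(\mathbf{x}\) given that \(H_i\) was true. In the maximum-likelihood test, the decision regions \(R_0\) and \(R_1\) are selected as

\[ R_0 = \{\mathbf{x}: P(\mathbf{x} \mid H_0) > P(\mathbf{x} \mid H_1)\} \qquad R_1 = \{\mathbf{x}: P(\mathbf{x} \mid H_0) < P(\mathbf{x} \mid H_1)\} \tag{8.5} \]

Thus, the maximum-likelihood test can be expressed as

\[ d(\mathbf{x}) = \begin{cases} H_0 & \text{if } P(\mathbf{x} \mid H_0) > P(\mathbf{x} \mid H_1) \\ H_1 & \text{if } P(\mathbf{x} \mid H_0) < P(\mathbf{x} \mid H_1) \end{cases} \tag{8.6} \]

The above decision test can be rewritten as

\[ \frac{P(\mathbf{x} \mid H_1)}{P(\mathbf{x} \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} 1 \tag{8.7} \]

If we define the likelihood ratio \(\Lambda(\mathbf{x})\) as

\[ \Lambda(\mathbf{x}) = \frac{P(\mathbf{x} \mid H_1)}{P(\mathbf{x} \mid H_0)} \tag{8.8} \]

then the maximum-likelihood test (8.7) can be expressed as

\[ \Lambda(\mathbf{x}) \underset{H_0}{\overset{H_1}{\gtrless}} 1 \tag{8.9} \]

which is called the likelihood ratio test, and 1 is called the threshold value of the test.
Note that the likelihood ratio \(\Lambda(\mathbf{x})\) is also often expressed as

\[ \Lambda(\mathbf{x}) = \frac{f(\mathbf{x} \mid H_1)}{f(\mathbf{x} \mid H_0)} \tag{8.10} \]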


B. MAP Test: 


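Let \(P(H_i \mid \mathbf{x})\), i = 0, 1, denote the probability that \(H_i\) was true given a particular value of \(\mathbf{x}\). The conditional probability \(P(H_i \mid \mathbf{x})\) is called a posteriori (or posterior) probability, that is, a probability that is computed after an observation has been made. The probability \(P(H_i)\), i = 0, 1, is called a priori (or prior) probability. In the maximum a posteriori (MAP) test, the decision regions \(R_0\) and \(R_1\) are selected as

\[ R_0 = \{\mathbf{x}: P(H_0 \mid \mathbf{x}) > P(H_1 \mid \mathbf{x})\} \qquad R_1 = \{\mathbf{x}: P(H_0 \mid \mathbf{x}) < P(H_1 \mid \mathbf{x})\} \tag{8.11} \]

Thus, the MAP test is given by

\[ d(\mathbf{x}) = \begin{cases} H_0 & \text{if } P(H_0 \mid \mathbf{x}) > P(H_1 \mid \mathbf{x}) \\ H_1 & \text{if } P(H_0 \mid \mathbf{x}) < P(H_1 \mid \mathbf{x}) \end{cases} \tag{8.12} \]

which can be rewritten as

\[ \frac{P(H_1 \mid \mathbf{x})}{P(H_0 \mid \mathbf{x})} \underset{H_0}{\overset{H_1}{\gtrless}} 1 \tag{8.13} \]

Using Bayes' rule [Eq. (1.42)], Eq. (8.13) reduces to

\[ \frac{P(\mathbf{x} \mid H_1) P(H_1)}{P(\mathbf{x} \mid H_0) P(H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} 1 \tag{8.14} \]

Using the likelihood ratio \(\Lambda(\mathbf{x})\) defined in Eq. (8.8), the MAP test can be expressed as the following likelihood ratio test:

\[ \Lambda(\mathbf{x}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta = \frac{P(H_0)}{P(H_1)} \tag{8.15} \]

where \(\eta = P(H_0)/P(H_1)\) is the threshold value for the MAP test. Note that when \(P(H_0) = P(H_1)\), the maximum-likelihood test is also the MAP test.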
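The maximum-likelihood and MAP tests differ only in the threshold against which the likelihood ratio is compared, which a short sketch can make concrete. The Python below is an illustration, not part of the text: the function names are hypothetical, and the unit-variance normal densities with means 0 and 1, the observation 0.6, and the prior \(P(H_0) = 2/3\) are assumed example values.

```python
import math

def normal_pdf(x, mean, var=1.0):
    """Density of a normal r.v. with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def likelihood_ratio(x, pdf_h0, pdf_h1):
    """Lambda(x) = f(x | H1) / f(x | H0), as in Eq. (8.10)."""
    return pdf_h1(x) / pdf_h0(x)

def lrt_decision(x, pdf_h0, pdf_h1, eta=1.0):
    """Return 1 (accept H1) if Lambda(x) > eta, otherwise 0 (accept H0)."""
    return 1 if likelihood_ratio(x, pdf_h0, pdf_h1) > eta else 0

def map_threshold(p_h0):
    """MAP threshold eta = P(H0) / P(H1), as in Eq. (8.15)."""
    return p_h0 / (1.0 - p_h0)

# Assumed example: unit-variance normal densities, mean 0 under H0 and 1 under H1.
def pdf_h0(x):
    return normal_pdf(x, 0.0)

def pdf_h1(x):
    return normal_pdf(x, 1.0)

x_obs = 0.6
ml_choice = lrt_decision(x_obs, pdf_h0, pdf_h1)                           # eta = 1
map_choice = lrt_decision(x_obs, pdf_h0, pdf_h1, map_threshold(2.0 / 3))  # eta = 2
print(ml_choice, map_choice)
```

With these assumed values the likelihood ratio is \(e^{0.1}\), which exceeds 1 but not 2, so the same observation leads the ML test to \(H_1\) and the MAP test to \(H_0\).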


C. Neyman-Pearson Test: 


As we mentioned before, it is not possible to simultaneously minimize both \(\alpha\) (= \(P_I\)) and \(\beta\) (= \(P_{II}\)). The Neyman-Pearson test provides a workable solution to this problem in that the test minimizes \(\beta\) for a given level of \(\alpha\). Hence, the Neyman-Pearson test is the test which maximizes the power of the test \(1 - \beta\) for a given level of significance \(\alpha\). In the Neyman-Pearson test, the critical (or rejection) region \(R_1\) is selected such that \(1 - \beta = 1 - P(D_0 \mid H_1) = P(D_1 \mid H_1)\) is maximum subject to the constraint \(\alpha = P(D_1 \mid H_0) = \alpha_0\). This is a classical problem in optimization: maximizing a function subject to a constraint, which can be solved by the Lagrange multiplier method. We thus construct the objective function

\[ J = (1 - \beta) - \lambda(\alpha - \alpha_0) \tag{8.16} \]

where \(\lambda \ge 0\) is a Lagrange multiplier. Then the critical region \(R_1\) is chosen to maximize J. It can be shown that the Neyman-Pearson test can be expressed in terms of the likelihood ratio test as (Prob. 8.8)

\[ \Lambda(\mathbf{x}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta = \lambda \tag{8.17} \]

where the threshold value \(\eta\) of the test is equal to the Lagrange multiplier \(\lambda\), which is chosen to satisfy the constraint \(\alpha = \alpha_0\).


D. Bayes’ Test: 


Let \(C_{ij}\) be the cost associated with \((D_i, H_j)\), which denotes the event that we accept \(H_i\) when \(H_j\) is true. Then the average cost, which is known as the Bayes' risk, can be written as

\[ \bar{C} = C_{00} P(D_0, H_0) + C_{10} P(D_1, H_0) + C_{01} P(D_0, H_1) + C_{11} P(D_1, H_1) \tag{8.18} \]

where \(P(D_i, H_j)\) denotes the probability that we accept \(H_i\) when \(H_j\) is true. By Bayes' rule (1.42), we have

\[ \bar{C} = C_{00} P(D_0 \mid H_0) P(H_0) + C_{10} P(D_1 \mid H_0) P(H_0) + C_{01} P(D_0 \mid H_1) P(H_1) + C_{11} P(D_1 \mid H_1) P(H_1) \tag{8.19} \]

In general, we assume that

\[ C_{10} > C_{00} \quad \text{and} \quad C_{01} > C_{11} \tag{8.20} \]

since it is reasonable to assume that the cost of making an incorrect decision is higher than the cost of making a correct decision. The test that minimizes the average cost \(\bar{C}\) is called the Bayes' test, and it can be expressed in terms of the likelihood ratio test as (Prob. 8.10)

\[ \Lambda(\mathbf{x}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta = \frac{(C_{10} - C_{00}) P(H_0)}{(C_{01} - C_{11}) P(H_1)} \tag{8.21} \]

Note that when \(C_{10} - C_{00} = C_{01} - C_{11}\), the Bayes' test (8.21) and the MAP test (8.15) are identical.


E. Minimum Probability of Error Test:

If we set \(C_{00} = C_{11} = 0\) and \(C_{01} = C_{10} = 1\) in Eq. (8.18), we have

\[ \bar{C} = P(D_1, H_0) + P(D_0, H_1) = P_e \tag{8.22} \]

which is just the probability of making an incorrect decision. Thus, in this case, the Bayes' test yields the minimum probability of error, and Eq. (8.21) becomes

\[ \Lambda(\mathbf{x}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta = \frac{P(H_0)}{P(H_1)} \tag{8.23} \]

We see that the minimum probability of error test is the same as the MAP test.
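The way the costs and priors enter the Bayes' threshold can be sketched in a few lines. The helper below is a hypothetical illustration (its name and default arguments are not from the text); it evaluates the threshold of Eq. (8.21) and shows that unit error costs with zero correct-decision costs give back the MAP threshold.

```python
def bayes_threshold(p_h0, c00=0.0, c10=1.0, c01=1.0, c11=0.0):
    """Threshold eta of Eq. (8.21) for prior P(H0) = p_h0 and costs C_ij.

    The defaults (C00 = C11 = 0, C01 = C10 = 1) correspond to the
    minimum probability of error test, for which eta = P(H0) / P(H1).
    """
    if not 0.0 < p_h0 < 1.0:
        raise ValueError("p_h0 must lie strictly between 0 and 1")
    if not (c10 > c00 and c01 > c11):
        raise ValueError("Eq. (8.20) requires C10 > C00 and C01 > C11")
    p_h1 = 1.0 - p_h0
    return (c10 - c00) * p_h0 / ((c01 - c11) * p_h1)

# Equal priors and symmetric costs: the threshold is 1, i.e. the ML test.
eta_symmetric = bayes_threshold(0.5)

# Assumed example: a false alarm costs twice as much as a miss and P(H0) = 0.25.
eta_costly_false_alarm = bayes_threshold(0.25, c10=2.0, c01=1.0)
print(eta_symmetric, eta_costly_false_alarm)
```

Raising \(C_{10}\) or \(P(H_0)\) raises the threshold, so the test demands stronger evidence before it accepts \(H_1\).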


F. Minimax Test: 


We have seen that the Bayes' test requires the a priori probabilities \(P(H_0)\) and \(P(H_1)\). Frequently, these probabilities are not known. In such a case, the Bayes' test cannot be applied, and the following minimax (min-max) test may be used. In the minimax test, we use the Bayes' test which corresponds to the least favorable \(P(H_0)\) (Prob. 8.12). In the minimax test, the critical region \(R_1^*\) is defined by

\[ \max_{P(H_0)} \bar{C}[P(H_0), R_1^*] = \min_{R_1} \max_{P(H_0)} \bar{C}[P(H_0), R_1] < \max_{P(H_0)} \bar{C}[P(H_0), R_1] \tag{8.24} \]

for all \(R_1 \ne R_1^*\). In other words, \(R_1^*\) is the critical region which yields the minimum Bayes' risk for the least favorable \(P(H_0)\). Assuming that the minimization and maximization operations are interchangeable, we have

\[ \min_{R_1} \max_{P(H_0)} \bar{C}[P(H_0), R_1] = \max_{P(H_0)} \min_{R_1} \bar{C}[P(H_0), R_1] \tag{8.25} \]

The minimization of \(\bar{C}[P(H_0), R_1]\) with respect to \(R_1\) is simply the Bayes' test, so that

\[ \min_{R_1} \bar{C}[P(H_0), R_1] = C^*[P(H_0)] \tag{8.26} \]

where \(C^*[P(H_0)]\) is the minimum Bayes' risk associated with the a priori probability \(P(H_0)\). Thus, Eq. (8.25) states that we may find the minimax test by finding the Bayes' test for the least favorable \(P(H_0)\), that is, the \(P(H_0)\) which maximizes \(C^*[P(H_0)]\).


Solved Problems 


HYPOTHESIS TESTING 


8.1. Suppose a manufacturer of memory chips observes that the probability of chip failure is p = 0.05. A new procedure is introduced to improve the design of chips. To test this new procedure, 200 chips could be produced using this new procedure and tested. Let r.v. X denote the number of these 200 chips that fail. We set the test rule that we would accept the new procedure if \(X \le 5\). Let

\(H_0\): p = 0.05 (No change hypothesis)
\(H_1\): p < 0.05 (Improvement hypothesis)

Find the probability of a Type I error.

If we assume that these tests using the new procedure are independent and have the same probability of failure on each test, then X is a binomial r.v. with parameters (n, p) = (200, p). We make a Type I error if \(X \le 5\) when in fact p = 0.05. Thus, using Eq. (2.37), we have

\[ P_I = P(D_1 \mid H_0) = P(X \le 5;\ p = 0.05) = \sum_{k=0}^{5} \binom{200}{k} (0.05)^k (0.95)^{200-k} \]

Since n is rather large and p is small, these binomial probabilities can be approximated by Poisson probabilities with \(\lambda = np = 200(0.05) = 10\) (see Prob. 2.40). Thus, using Eq. (2.100), we obtain

\[ P_I \approx \sum_{k=0}^{5} e^{-10} \frac{10^k}{k!} = 0.067 \]

Note that \(H_0\) is a simple hypothesis but \(H_1\) is a composite hypothesis.


8.2. Consider again the memory chip manufacturing problem of Prob. 8.1. Now let

\(H_0\): p = 0.05 (No change hypothesis)
\(H_1\): p = 0.02 (Improvement hypothesis)

Again our rule is, we would reject the new procedure if X > 5. Find the probability of a Type II error.

Now both hypotheses are simple. We make a Type II error if X > 5 when in fact p = 0.02. Hence, by Eq. (2.37),

\[ P_{II} = P(D_0 \mid H_1) = P(X > 5;\ p = 0.02) = \sum_{k=6}^{200} \binom{200}{k} (0.02)^k (0.98)^{200-k} \]

Again using the Poisson approximation with \(\lambda = np = 200(0.02) = 4\), we obtain

\[ P_{II} \approx 1 - \sum_{k=0}^{5} e^{-4} \frac{4^k}{k!} = 0.215 \]
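The Poisson approximations used in Probs. 8.1 and 8.2 can be checked numerically against the exact binomial sums. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical.

```python
import math

def binomial_cdf(k, n, p):
    """P(X <= k) for a binomial r.v. with parameters (n, p)."""
    return sum(math.comb(n, j) * p ** j * (1.0 - p) ** (n - j) for j in range(k + 1))

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson r.v. with parameter lam."""
    return math.exp(-lam) * sum(lam ** j / math.factorial(j) for j in range(k + 1))

# Prob. 8.1: Type I error, P(X <= 5) when p = 0.05
type1_exact = binomial_cdf(5, 200, 0.05)
type1_poisson = poisson_cdf(5, 200 * 0.05)

# Prob. 8.2: Type II error, P(X > 5) when p = 0.02
type2_exact = 1.0 - binomial_cdf(5, 200, 0.02)
type2_poisson = 1.0 - poisson_cdf(5, 200 * 0.02)

print(f"Type I : exact {type1_exact:.4f}, Poisson {type1_poisson:.4f}")
print(f"Type II: exact {type2_exact:.4f}, Poisson {type2_poisson:.4f}")
```

The Poisson values agree with the 0.067 and 0.215 quoted in the solutions; the exact binomial sums differ from them only slightly.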


8.3. Let \((X_1, \ldots, X_n)\) be a random sample of a normal r.v. X with mean \(\mu\) and variance 100. Let

\(H_0\): \(\mu = 50\)
\(H_1\): \(\mu = \mu_1\) (> 50)

and sample size n = 25. As a decision procedure, we use the rule to reject \(H_0\) if \(\bar{x} \ge 52\), where \(\bar{x}\) is the value of the sample mean \(\bar{X}\) defined by Eq. (7.27).

(a) Find the probability of rejecting \(H_0\): \(\mu = 50\) as a function of \(\mu\) (> 50).
(b) Find the probability of a Type I error \(\alpha\).
(c) Find the probability of a Type II error \(\beta\) (i) when \(\mu_1 = 53\) and (ii) when \(\mu_1 = 55\).

(a) Since the test calls for the rejection of \(H_0\): \(\mu = 50\) when \(\bar{x} \ge 52\), the probability of rejecting \(H_0\) is given by

\[ g(\mu) = P(\bar{X} \ge 52;\ \mu) \tag{8.27} \]

Now, by Eqs. (4.112) and (7.27), we have

\[ \mathrm{Var}(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{1}{n}\sigma^2 = \frac{100}{25} = 4 \]

Thus, \(\bar{X}\) is \(N(\mu; 4)\), and using Eq. (2.55), we obtain

\[ g(\mu) = P\left(\frac{\bar{X} - \mu}{2} \ge \frac{52 - \mu}{2};\ \mu\right) = 1 - \Phi\left(\frac{52 - \mu}{2}\right) \qquad \mu \ge 50 \tag{8.28} \]

The function \(g(\mu)\) is known as the power function of the test, and the value of \(g(\mu)\) at \(\mu = \mu_1\), \(g(\mu_1)\), is called the power at \(\mu_1\).

(b) Note that the power at \(\mu = 50\), g(50), is the probability of rejecting \(H_0\): \(\mu = 50\) when \(H_0\) is true, that is, a Type I error. Thus, using Table A (Appendix A), we obtain

\[ \alpha = P_I = g(50) = 1 - \Phi\left(\frac{52 - 50}{2}\right) = 1 - \Phi(1) = 0.1587 \]

(c) Note that the power at \(\mu = \mu_1\), \(g(\mu_1)\), is the probability of rejecting \(H_0\): \(\mu = 50\) when \(\mu = \mu_1\). Thus, \(1 - g(\mu_1)\) is the probability of accepting \(H_0\) when \(\mu = \mu_1\), that is, the probability of a Type II error \(\beta\).

(i) Setting \(\mu = \mu_1 = 53\) in Eq. (8.28) and using Table A (Appendix A), we obtain

\[ \beta = P_{II} = 1 - g(53) = \Phi\left(\frac{52 - 53}{2}\right) = \Phi\left(-\tfrac{1}{2}\right) = 1 - \Phi\left(\tfrac{1}{2}\right) = 0.3085 \]

(ii) Similarly, for \(\mu = \mu_1 = 55\) we obtain

\[ \beta = P_{II} = 1 - g(55) = \Phi\left(\frac{52 - 55}{2}\right) = \Phi\left(-\tfrac{3}{2}\right) = 1 - \Phi\left(\tfrac{3}{2}\right) = 0.0668 \]

Notice that the probability of a Type II error clearly depends on the value of \(\mu_1\).

8.4. Consider the binary decision problem of Prob. 8.3. We modify the decision rule such that we reject \(H_0\) if \(\bar{x} \ge c\).

(a) Find the value of c such that the probability of a Type I error \(\alpha = 0.05\).
(b) Find the probability of a Type II error \(\beta\) when \(\mu_1 = 55\) with the modified decision rule.

(a) Using the result of part (b) in Prob. 8.3, c is selected such that [see Eq. (8.27)]

\[ \alpha = g(50) = P(\bar{X} \ge c;\ \mu = 50) = 0.05 \]

However, when \(\mu = 50\), \(\bar{X} = N(50; 4)\), and [see Eq. (8.28)]

\[ g(50) = P\left(\frac{\bar{X} - 50}{2} \ge \frac{c - 50}{2};\ \mu = 50\right) = 1 - \Phi\left(\frac{c - 50}{2}\right) = 0.05 \]

From Table A (Appendix A), we have \(\Phi(1.645) = 0.95\). Thus

\[ \frac{c - 50}{2} = 1.645 \quad \text{and} \quad c = 50 + 2(1.645) = 53.29 \]

(b) The power function \(g(\mu)\) with the modified decision rule is

\[ g(\mu) = P(\bar{X} \ge 53.29;\ \mu) = P\left(\frac{\bar{X} - \mu}{2} \ge \frac{53.29 - \mu}{2};\ \mu\right) = 1 - \Phi\left(\frac{53.29 - \mu}{2}\right) \]

Setting \(\mu = \mu_1 = 55\) and using Table A (Appendix A), we obtain

\[ \beta = P_{II} = 1 - g(55) = \Phi\left(\frac{53.29 - 55}{2}\right) = \Phi(-0.855) = 1 - \Phi(0.855) = 0.1963 \]

Comparing with the results of Prob. 8.3, we notice that with the change of the decision rule, \(\alpha\) is reduced from 0.1587 to 0.05, but \(\beta\) is increased from 0.0668 to 0.1963.

8.5. Redo Prob. 8.4 for the case where the sample size n = 100.

(a) With n = 100, we have

\[ \mathrm{Var}(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{1}{n}\sigma^2 = \frac{100}{100} = 1 \]

As in part (a) of Prob. 8.4, c is selected so that

\[ \alpha = g(50) = P(\bar{X} \ge c;\ \mu = 50) = 0.05 \]

Since \(\bar{X} = N(50; 1)\), we have

\[ g(50) = P(\bar{X} - 50 \ge c - 50;\ \mu = 50) = 1 - \Phi(c - 50) = 0.05 \]

Thus c - 50 = 1.645 and c = 51.645.

(b) The power function is

\[ g(\mu) = P(\bar{X} \ge 51.645;\ \mu) = P(\bar{X} - \mu \ge 51.645 - \mu;\ \mu) = 1 - \Phi(51.645 - \mu) \]

Setting \(\mu = \mu_1 = 55\) and using Table A (Appendix A), we obtain

\[ \beta = P_{II} = 1 - g(55) = \Phi(51.645 - 55) = \Phi(-3.355) = 0.0004 \]

Notice that with sample size n = 100, both \(\alpha\) and \(\beta\) have decreased from their respective original values of 0.1587 and 0.0668 when n = 25.
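The trade-off worked out in Probs. 8.3 to 8.5 can be reproduced with a small power-function sketch. This is an illustration only: the function names are hypothetical, and `statistics.NormalDist` from the Python standard library stands in for Table A.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution, used in place of Table A

def power(mu, c, sigma=10.0, n=25):
    """g(mu) = P(sample mean >= c; mu) for a normal sample of size n."""
    std_err = sigma / math.sqrt(n)
    return 1.0 - PHI.cdf((c - mu) / std_err)

def critical_value(alpha, mu0=50.0, sigma=10.0, n=25):
    """Threshold c for which the Type I error g(mu0) equals alpha."""
    std_err = sigma / math.sqrt(n)
    return mu0 + std_err * PHI.inv_cdf(1.0 - alpha)

# Prob. 8.3: fixed threshold c = 52 with n = 25
alpha_83 = power(50.0, c=52.0)
beta_83_at_53 = 1.0 - power(53.0, c=52.0)
beta_83_at_55 = 1.0 - power(55.0, c=52.0)
print(f"Prob. 8.3: alpha={alpha_83:.4f}, beta(53)={beta_83_at_53:.4f}, beta(55)={beta_83_at_55:.4f}")

# Probs. 8.4 and 8.5: alpha held at 0.05 while the sample size changes
for n in (25, 100):
    c = critical_value(0.05, n=n)
    beta = 1.0 - power(55.0, c=c, n=n)
    print(f"n={n}: c={c:.3f}, beta={beta:.4f}")
```

Lowering \(\alpha\) at fixed n moves c to the right and raises \(\beta\); increasing n shrinks the standard error, so both error probabilities can fall together.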


DECISION TESTS 


8.6. In a simple binary communication system, during every T seconds, one of two possible signals \(s_0(t)\) and \(s_1(t)\) is transmitted. Our two hypotheses are

\(H_0\): \(s_0(t)\) was transmitted.
\(H_1\): \(s_1(t)\) was transmitted.

We assume that

\[ s_0(t) = 0 \quad \text{and} \quad s_1(t) = 1 \qquad 0 < t < T \]

The communication channel adds noise n(t), which is a zero-mean normal random process with variance 1. Let x(t) represent the received signal:

\[ x(t) = s_i(t) + n(t) \qquad i = 0, 1 \]

We observe the received signal x(t) at some instant during each signaling interval. Suppose that we received an observation x = 0.6.

(a) Using the maximum likelihood test, determine which signal is transmitted.
(b) Find \(P_I\) and \(P_{II}\).

(a) The received signal under each hypothesis can be written as

\(H_0\): x = n
\(H_1\): x = 1 + n

Then the pdf of x under each hypothesis is given by

\[ f(x \mid H_0) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \]
\[ f(x \mid H_1) = \frac{1}{\sqrt{2\pi}} e^{-(x-1)^2/2} \]

The likelihood ratio is then given by

\[ \Lambda(x) = \frac{f(x \mid H_1)}{f(x \mid H_0)} = e^{(x - 1/2)} \]

By Eq. (8.9), the maximum likelihood test is

\[ e^{(x - 1/2)} \underset{H_0}{\overset{H_1}{\gtrless}} 1 \]

Taking the natural logarithm of the above expression, we get

\[ x - \tfrac{1}{2} \underset{H_0}{\overset{H_1}{\gtrless}} 0 \quad \text{or} \quad x \underset{H_0}{\overset{H_1}{\gtrless}} \tfrac{1}{2} \]

Since x = 0.6 > 1/2, we determine that signal \(s_1(t)\) was transmitted.

(b) The decision regions are given by

\[ R_0 = \{x: x < \tfrac{1}{2}\} = (-\infty, \tfrac{1}{2}) \qquad R_1 = \{x: x > \tfrac{1}{2}\} = (\tfrac{1}{2}, \infty) \]

Then by Eqs. (8.1) and (8.2) and using Table A (Appendix A), we obtain

\[ P_I = P(D_1 \mid H_0) = \int_{R_1} f(x \mid H_0)\, dx = \frac{1}{\sqrt{2\pi}} \int_{1/2}^{\infty} e^{-x^2/2}\, dx = 1 - \Phi\left(\tfrac{1}{2}\right) = 0.3085 \]

\[ P_{II} = P(D_0 \mid H_1) = \int_{R_0} f(x \mid H_1)\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{1/2} e^{-(x-1)^2/2}\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-1/2} e^{-y^2/2}\, dy = \Phi\left(-\tfrac{1}{2}\right) = 0.3085 \]


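8.7. In the binary communication system of Prob. 8.6, suppose that \(P(H_0) = \tfrac{2}{3}\) and \(P(H_1) = \tfrac{1}{3}\).

(a) Using the MAP test, determine which signal is transmitted when x = 0.6.
(b) Find \(P_I\) and \(P_{II}\).

(a) Using the result of Prob. 8.6 and Eq. (8.15), the MAP test is given by

\[ e^{(x - 1/2)} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{P(H_0)}{P(H_1)} = 2 \]

Taking the natural logarithm of the above expression, we get

\[ x - \tfrac{1}{2} \underset{H_0}{\overset{H_1}{\gtrless}} \ln 2 \quad \text{or} \quad x \underset{H_0}{\overset{H_1}{\gtrless}} \tfrac{1}{2} + \ln 2 = 1.193 \]

Since x = 0.6 < 1.193, we determine that signal \(s_0(t)\) was transmitted.

(b) The decision regions are given by

\[ R_0 = \{x: x < 1.193\} = (-\infty, 1.193) \qquad R_1 = \{x: x > 1.193\} = (1.193, \infty) \]

Thus, by Eqs. (8.1) and (8.2) and using Table A (Appendix A), we obtain

\[ P_I = P(D_1 \mid H_0) = \int_{R_1} f(x \mid H_0)\, dx = \frac{1}{\sqrt{2\pi}} \int_{1.193}^{\infty} e^{-x^2/2}\, dx = 1 - \Phi(1.193) = 0.1164 \]

\[ P_{II} = P(D_0 \mid H_1) = \int_{R_0} f(x \mid H_1)\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{1.193} e^{-(x-1)^2/2}\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0.193} e^{-y^2/2}\, dy = \Phi(0.193) = 0.5765 \]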
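For the channel of Prob. 8.6, every likelihood ratio test reduces to comparing x with \(1/2 + \ln \eta\), so both error probabilities can be written as functions of \(\eta\). The sketch below is an illustration with hypothetical function names; it also evaluates the total error probability \(P_e = P(H_0) P_I + P(H_1) P_{II}\) for the priors of Prob. 8.7.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution, used in place of Table A

def error_probabilities(eta):
    """Threshold on x and (P_I, P_II) for the test exp(x - 1/2) > eta.

    Assumes unit-variance normal noise with mean 0 under H0 and 1 under H1.
    """
    x_threshold = 0.5 + math.log(eta)
    p_type1 = 1.0 - PHI.cdf(x_threshold)   # P(x > threshold | H0)
    p_type2 = PHI.cdf(x_threshold - 1.0)   # P(x < threshold | H1)
    return x_threshold, p_type1, p_type2

def total_error(eta, p_h0):
    """P_e = P(H0) * P_I + P(H1) * P_II for a given threshold eta."""
    _, p_type1, p_type2 = error_probabilities(eta)
    return p_h0 * p_type1 + (1.0 - p_h0) * p_type2

p_h0 = 2.0 / 3.0
pe_ml = total_error(1.0, p_h0)                   # ML threshold, eta = 1
pe_map = total_error(p_h0 / (1.0 - p_h0), p_h0)  # MAP threshold, eta = 2
print(f"P_e with ML threshold:  {pe_ml:.4f}")
print(f"P_e with MAP threshold: {pe_map:.4f}")
```

Although the MAP test has a much larger \(P_{II}\) than the ML test, its total error probability is smaller under these priors, consistent with the MAP test being the minimum probability of error test.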


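8.8. Derive the Neyman-Pearson test, Eq. (8.17).

From Eq. (8.16), the objective function is

\[ J = (1 - \beta) - \lambda(\alpha - \alpha_0) = P(D_1 \mid H_1) - \lambda[P(D_1 \mid H_0) - \alpha_0] \tag{8.29} \]

where \(\lambda\) is an undetermined Lagrange multiplier which is chosen to satisfy the constraint \(\alpha = \alpha_0\). Now, we wish to choose the critical region \(R_1\) to maximize J. Using Eqs. (8.1) and (8.2), we have

\[ J = \int_{R_1} f(\mathbf{x} \mid H_1)\, d\mathbf{x} - \lambda\left[\int_{R_1} f(\mathbf{x} \mid H_0)\, d\mathbf{x} - \alpha_0\right] = \int_{R_1} [f(\mathbf{x} \mid H_1) - \lambda f(\mathbf{x} \mid H_0)]\, d\mathbf{x} + \lambda\alpha_0 \tag{8.30} \]

To maximize J by selecting the critical region \(R_1\), we select \(\mathbf{x} \in R_1\) such that the integrand in Eq. (8.30) is positive. Thus \(R_1\) is given by

\[ R_1 = \{\mathbf{x}: [f(\mathbf{x} \mid H_1) - \lambda f(\mathbf{x} \mid H_0)] > 0\} \]

and the Neyman-Pearson test is given by

\[ \Lambda(\mathbf{x}) = \frac{f(\mathbf{x} \mid H_1)}{f(\mathbf{x} \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \lambda \]

and \(\lambda\) is determined such that the constraint

\[ \alpha = P_I = P(D_1 \mid H_0) = \int_{R_1} f(\mathbf{x} \mid H_0)\, d\mathbf{x} = \alpha_0 \]

is satisfied.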


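8.9. Consider the binary communication system of Prob. 8.6 and suppose that we require that \(\alpha = P_I = 0.25\).

(a) Using the Neyman-Pearson test, determine which signal is transmitted when x = 0.6.
(b) Find \(P_{II}\).

(a) Using the result of Prob. 8.6 and Eq. (8.17), the Neyman-Pearson test is given by

\[ e^{(x - 1/2)} \underset{H_0}{\overset{H_1}{\gtrless}} \lambda \]

Taking the natural logarithm of the above expression, we get

\[ x - \tfrac{1}{2} \underset{H_0}{\overset{H_1}{\gtrless}} \ln \lambda \quad \text{or} \quad x \underset{H_0}{\overset{H_1}{\gtrless}} \tfrac{1}{2} + \ln \lambda \]

The critical region \(R_1\) is thus

\[ R_1 = \{x: x > \tfrac{1}{2} + \ln \lambda\} \]

Now we must determine \(\lambda\) such that \(\alpha = P_I = P(D_1 \mid H_0) = 0.25\). By Eq. (8.1), we have

\[ P_I = P(D_1 \mid H_0) = \int_{R_1} f(x \mid H_0)\, dx = \frac{1}{\sqrt{2\pi}} \int_{1/2 + \ln \lambda}^{\infty} e^{-x^2/2}\, dx = 1 - \Phi\left(\tfrac{1}{2} + \ln \lambda\right) \]

Thus \(1 - \Phi(\tfrac{1}{2} + \ln \lambda) = 0.25\) or \(\Phi(\tfrac{1}{2} + \ln \lambda) = 0.75\).
From Table A (Appendix A), we find that \(\Phi(0.674) = 0.75\). Thus

\[ \tfrac{1}{2} + \ln \lambda = 0.674 \quad \Rightarrow \quad \lambda = 1.19 \]

Then the Neyman-Pearson test is

\[ x \underset{H_0}{\overset{H_1}{\gtrless}} 0.674 \]

Since x = 0.6 < 0.674, we determine that signal \(s_0(t)\) was transmitted.

(b) By Eq. (8.2), we have

\[ P_{II} = P(D_0 \mid H_1) = \int_{R_0} f(x \mid H_1)\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0.674} e^{-(x-1)^2/2}\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-0.326} e^{-y^2/2}\, dy = \Phi(-0.326) = 0.3722 \]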
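The Neyman-Pearson design of Prob. 8.9 amounts to inverting the normal cdf at \(1 - \alpha_0\). A minimal sketch follows, assuming the unit-variance channel of Prob. 8.6; the function name is hypothetical and `NormalDist.inv_cdf` replaces the table lookup.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution, used in place of Table A

def neyman_pearson_design(alpha0):
    """Return (x_threshold, lam, beta) for the channel of Prob. 8.6.

    x_threshold satisfies P(x > x_threshold | H0) = alpha0,
    lam is the likelihood-ratio threshold exp(x_threshold - 1/2),
    beta is the resulting Type II error P(x < x_threshold | H1).
    """
    x_threshold = PHI.inv_cdf(1.0 - alpha0)
    lam = math.exp(x_threshold - 0.5)
    beta = PHI.cdf(x_threshold - 1.0)
    return x_threshold, lam, beta

x_threshold, lam, beta = neyman_pearson_design(0.25)
decision = 1 if 0.6 > x_threshold else 0   # observation x = 0.6
print(f"threshold={x_threshold:.3f}, lambda={lam:.2f}, beta={beta:.4f}, decide H{decision}")
```

The values differ from the tabulated ones only in the last digit, because the inverse cdf is evaluated without rounding the threshold to 0.674.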


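8.10. Derive Eq. (8.21).

By Eq. (8.19), the Bayes' risk is

\[ \bar{C} = C_{00} P(D_0 \mid H_0) P(H_0) + C_{10} P(D_1 \mid H_0) P(H_0) + C_{01} P(D_0 \mid H_1) P(H_1) + C_{11} P(D_1 \mid H_1) P(H_1) \]

Now we can express

\[ P(D_i \mid H_j) = \int_{R_i} f(\mathbf{x} \mid H_j)\, d\mathbf{x} \qquad i = 0, 1;\ j = 0, 1 \tag{8.31} \]

Then \(\bar{C}\) can be expressed as

\[ \bar{C} = C_{00} P(H_0) \int_{R_0} f(\mathbf{x} \mid H_0)\, d\mathbf{x} + C_{10} P(H_0) \int_{R_1} f(\mathbf{x} \mid H_0)\, d\mathbf{x} + C_{01} P(H_1) \int_{R_0} f(\mathbf{x} \mid H_1)\, d\mathbf{x} + C_{11} P(H_1) \int_{R_1} f(\mathbf{x} \mid H_1)\, d\mathbf{x} \tag{8.32} \]

Since \(R_0 \cup R_1 = S\) and \(R_0 \cap R_1 = \varnothing\), we can write

\[ \int_{R_0} f(\mathbf{x} \mid H_j)\, d\mathbf{x} = \int_{S} f(\mathbf{x} \mid H_j)\, d\mathbf{x} - \int_{R_1} f(\mathbf{x} \mid H_j)\, d\mathbf{x} = 1 - \int_{R_1} f(\mathbf{x} \mid H_j)\, d\mathbf{x} \]

Then Eq. (8.32) becomes

\[ \bar{C} = C_{00} P(H_0) + C_{01} P(H_1) + \int_{R_1} \{[(C_{10} - C_{00}) P(H_0) f(\mathbf{x} \mid H_0)] - [(C_{01} - C_{11}) P(H_1) f(\mathbf{x} \mid H_1)]\}\, d\mathbf{x} \]

The only variable in the above expression is the critical region \(R_1\). By the assumptions [Eq. (8.20)] \(C_{10} > C_{00}\) and \(C_{01} > C_{11}\), the two terms inside the brackets in the integral are both positive. Thus, \(\bar{C}\) is minimized if \(R_1\) is chosen such that

\[ (C_{01} - C_{11}) P(H_1) f(\mathbf{x} \mid H_1) > (C_{10} - C_{00}) P(H_0) f(\mathbf{x} \mid H_0) \]

for all \(\mathbf{x} \in R_1\). That is, we decide to accept \(H_1\) if

\[ (C_{01} - C_{11}) P(H_1) f(\mathbf{x} \mid H_1) > (C_{10} - C_{00}) P(H_0) f(\mathbf{x} \mid H_0) \]

In terms of the likelihood ratio, we obtain

\[ \Lambda(\mathbf{x}) = \frac{f(\mathbf{x} \mid H_1)}{f(\mathbf{x} \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{(C_{10} - C_{00}) P(H_0)}{(C_{01} - C_{11}) P(H_1)} \]

which is Eq. (8.21).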


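8.11. Consider a binary decision problem with the following conditional pdf's:

\[ f(x \mid H_0) = \tfrac{1}{2} e^{-|x|} \]
\[ f(x \mid H_1) = e^{-2|x|} \]

The Bayes' costs are given by

\[ C_{00} = C_{11} = 0 \qquad C_{01} = 2 \qquad C_{10} = 1 \]

(a) Determine the Bayes' test if \(P(H_0) = \tfrac{2}{3}\) and the associated Bayes' risk.
(b) Repeat (a) with \(P(H_0) = \tfrac{1}{2}\).

(a) The likelihood ratio is

\[ \Lambda(x) = \frac{f(x \mid H_1)}{f(x \mid H_0)} = \frac{e^{-2|x|}}{\tfrac{1}{2} e^{-|x|}} = 2e^{-|x|} \tag{8.33} \]

By Eq. (8.21), the Bayes' test is given by

\[ 2e^{-|x|} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{(1 - 0)\tfrac{2}{3}}{(2 - 0)\tfrac{1}{3}} = 1 \quad \text{or} \quad e^{-|x|} \underset{H_0}{\overset{H_1}{\gtrless}} \tfrac{1}{2} \]

Taking the natural logarithm of both sides of the last expression yields

\[ |x| \underset{H_0}{\overset{H_1}{\lessgtr}} -\ln\left(\tfrac{1}{2}\right) = 0.693 \]

Thus, the decision regions are given by

\[ R_0 = \{x: |x| > 0.693\} \qquad R_1 = \{x: |x| < 0.693\} \]

Then

\[ P_I = P(D_1 \mid H_0) = \int_{-0.693}^{0.693} \tfrac{1}{2} e^{-|x|}\, dx = 2 \int_{0}^{0.693} \tfrac{1}{2} e^{-x}\, dx = 0.5 \]

\[ P_{II} = P(D_0 \mid H_1) = \int_{-\infty}^{-0.693} e^{2x}\, dx + \int_{0.693}^{\infty} e^{-2x}\, dx = 2 \int_{0.693}^{\infty} e^{-2x}\, dx = 0.25 \]

and by Eq. (8.19), the Bayes' risk is

\[ \bar{C} = P(D_1 \mid H_0) P(H_0) + 2 P(D_0 \mid H_1) P(H_1) = (0.5)\left(\tfrac{2}{3}\right) + 2(0.25)\left(\tfrac{1}{3}\right) = 0.5 \]

(b) The Bayes' test is

\[ 2e^{-|x|} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{(1 - 0)\tfrac{1}{2}}{(2 - 0)\tfrac{1}{2}} = \tfrac{1}{2} \quad \text{or} \quad e^{-|x|} \underset{H_0}{\overset{H_1}{\gtrless}} \tfrac{1}{4} \]

Again, taking the natural logarithm of both sides of the last expression yields

\[ |x| \underset{H_0}{\overset{H_1}{\lessgtr}} -\ln\left(\tfrac{1}{4}\right) = 1.386 \]

Thus, the decision regions are given by

\[ R_0 = \{x: |x| > 1.386\} \qquad R_1 = \{x: |x| < 1.386\} \]

Then

\[ P_I = P(D_1 \mid H_0) = 2 \int_{0}^{1.386} \tfrac{1}{2} e^{-x}\, dx = 0.75 \]

\[ P_{II} = P(D_0 \mid H_1) = 2 \int_{1.386}^{\infty} e^{-2x}\, dx = 0.0625 \]

and by Eq. (8.19), the Bayes' risk is

\[ \bar{C} = (0.75)\left(\tfrac{1}{2}\right) + 2(0.0625)\left(\tfrac{1}{2}\right) = 0.4375 \]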
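Both parts of Prob. 8.11 follow the same steps, so they can be checked with one routine. The sketch below is an illustration with a hypothetical function name; it uses the closed-form integrals of the two pdf's, \(P_I = 1 - e^{-\delta}\) and \(P_{II} = e^{-2\delta}\), where \(\delta\) is the threshold on \(|x|\).

```python
import math

def bayes_test_8_11(p_h0, c01=2.0, c10=1.0):
    """Bayes' test for f(x|H0) = 0.5*exp(-|x|), f(x|H1) = exp(-2|x|), C00 = C11 = 0.

    Returns (delta, P_I, P_II, risk), where H1 is accepted when |x| < delta.
    """
    p_h1 = 1.0 - p_h0
    eta = c10 * p_h0 / (c01 * p_h1)   # threshold of Eq. (8.21)
    delta = math.log(2.0 / eta)       # 2*exp(-|x|) > eta  <=>  |x| < ln(2/eta)
    if delta <= 0.0:
        # R1 is empty: H0 is always accepted, so P_I = 0 and P_II = 1
        p_type1, p_type2 = 0.0, 1.0
    else:
        p_type1 = 1.0 - math.exp(-delta)   # P(|x| < delta | H0)
        p_type2 = math.exp(-2.0 * delta)   # P(|x| > delta | H1)
    risk = c10 * p_type1 * p_h0 + c01 * p_type2 * p_h1
    return delta, p_type1, p_type2, risk

for prior in (2.0 / 3.0, 0.5):
    delta, p1, p2, risk = bayes_test_8_11(prior)
    print(f"P(H0)={prior:.3f}: delta={delta:.3f}, P_I={p1:.4f}, P_II={p2:.4f}, risk={risk:.4f}")
```

The two priors reproduce the thresholds 0.693 and 1.386 and the risks 0.5 and 0.4375 obtained above.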


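8.12. Consider the binary decision problem of Prob. 8.11 with the same Bayes' costs. Determine the minimax test.

From Eq. (8.33), the likelihood ratio is

\[ \Lambda(x) = \frac{f(x \mid H_1)}{f(x \mid H_0)} = 2e^{-|x|} \]

In terms of \(P(H_0)\), the Bayes' test [Eq. (8.21)] becomes

\[ 2e^{-|x|} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{1}{2} \frac{P(H_0)}{1 - P(H_0)} \quad \text{or} \quad e^{-|x|} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{1}{4} \frac{P(H_0)}{1 - P(H_0)} \]

Taking the natural logarithm of both sides of the last expression yields

\[ |x| \underset{H_0}{\overset{H_1}{\lessgtr}} \ln \frac{4[1 - P(H_0)]}{P(H_0)} = \delta \tag{8.34} \]

For \(P(H_0) > 0.8\), \(\delta\) becomes negative, the region \(\{x: |x| < \delta\}\) is empty, and we always decide \(H_0\). For \(P(H_0) \le 0.8\), the decision regions are

\[ R_0 = \{x: |x| > \delta\} \qquad R_1 = \{x: |x| < \delta\} \]

Then, by setting \(C_{00} = C_{11} = 0\), \(C_{01} = 2\), and \(C_{10} = 1\) in Eq. (8.19), the minimum Bayes' risk \(C^*\) can be expressed as a function of \(P(H_0)\) as

\[ C^*[P(H_0)] = P(H_0) \int_{-\delta}^{\delta} \tfrac{1}{2} e^{-|x|}\, dx + 2[1 - P(H_0)] \left[\int_{-\infty}^{-\delta} e^{2x}\, dx + \int_{\delta}^{\infty} e^{-2x}\, dx\right] \]
\[ = P(H_0) \int_{0}^{\delta} e^{-x}\, dx + 4[1 - P(H_0)] \int_{\delta}^{\infty} e^{-2x}\, dx \]
\[ = P(H_0)(1 - e^{-\delta}) + 2[1 - P(H_0)] e^{-2\delta} \tag{8.35} \]

From the definition of \(\delta\) [Eq. (8.34)], we have

\[ e^{\delta} = \frac{4[1 - P(H_0)]}{P(H_0)} \]

Thus

\[ e^{-\delta} = \frac{P(H_0)}{4[1 - P(H_0)]} \quad \text{and} \quad e^{-2\delta} = \frac{P^2(H_0)}{16[1 - P(H_0)]^2} \]

Substituting these values into Eq. (8.35), we obtain

\[ C^*[P(H_0)] = \frac{8P(H_0) - 9P^2(H_0)}{8[1 - P(H_0)]} \]

Now the value of \(P(H_0)\) which maximizes \(C^*\) can be obtained by setting \(dC^*[P(H_0)]/dP(H_0)\) equal to zero and solving for \(P(H_0)\). The result yields \(P(H_0) = \tfrac{2}{3}\). Substituting this value into Eq. (8.34), we obtain the following minimax test:

\[ |x| \underset{H_0}{\overset{H_1}{\lessgtr}} \ln 2 = 0.69 \]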
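The least favorable prior of Prob. 8.12 can also be located numerically instead of by differentiation. The sketch below is an illustration, not the method of the text: it swaps in a plain grid search over \(P(H_0)\) (the grid resolution is an arbitrary choice) and uses the closed-form minimum Bayes' risk derived above.

```python
import math

def min_bayes_risk(p_h0):
    """C*[P(H0)] = (8p - 9p^2) / (8(1 - p)), valid for 0 < p <= 0.8."""
    return (8.0 * p_h0 - 9.0 * p_h0 ** 2) / (8.0 * (1.0 - p_h0))

def delta_for_prior(p_h0):
    """Threshold on |x| from Eq. (8.34)."""
    return math.log(4.0 * (1.0 - p_h0) / p_h0)

# Grid search for the prior that maximizes the minimum Bayes' risk.
grid = [k / 10000.0 for k in range(1, 8001)]
p_least_favorable = max(grid, key=min_bayes_risk)
delta_minimax = delta_for_prior(p_least_favorable)
risk_minimax = min_bayes_risk(p_least_favorable)
print(f"P(H0)={p_least_favorable:.4f}, delta={delta_minimax:.3f}, risk={risk_minimax:.4f}")
```

At the minimax threshold \(\delta = \ln 2\), the weighted errors are equal, \(C_{10} P_I = 0.5\) and \(C_{01} P_{II} = 2(0.25) = 0.5\), so the risk of this test is 0.5 whatever the actual prior is.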


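8.13. Suppose that we have n observations \(X_i\), i = 1, ..., n, of radar signals, and \(X_i\) are normal iid r.v.'s under each hypothesis. Under \(H_0\), \(X_i\) have mean \(\mu_0\) and variance \(\sigma^2\), while under \(H_1\), \(X_i\) have mean \(\mu_1\) and variance \(\sigma^2\), and \(\mu_1 > \mu_0\). Determine the maximum likelihood test.

By Eq. (2.52) for each \(X_i\), we have

\[ f(x_i \mid H_0) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2\sigma^2}(x_i - \mu_0)^2\right] \]
\[ f(x_i \mid H_1) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2\sigma^2}(x_i - \mu_1)^2\right] \]

Since the \(X_i\) are independent, we have

\[ f(\mathbf{x} \mid H_0) = \prod_{i=1}^{n} f(x_i \mid H_0) = \frac{1}{(2\pi)^{n/2}\sigma^n} \exp\left[-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu_0)^2\right] \]
\[ f(\mathbf{x} \mid H_1) = \prod_{i=1}^{n} f(x_i \mid H_1) = \frac{1}{(2\pi)^{n/2}\sigma^n} \exp\left[-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu_1)^2\right] \]

With \(\mu_1 - \mu_0 > 0\), the likelihood ratio is then given by

\[ \Lambda(\mathbf{x}) = \frac{f(\mathbf{x} \mid H_1)}{f(\mathbf{x} \mid H_0)} = \exp\left\{\frac{1}{2\sigma^2}\left[\sum_{i=1}^{n} 2(\mu_1 - \mu_0) x_i - n(\mu_1^2 - \mu_0^2)\right]\right\} \]

Hence, the maximum likelihood test is given by

\[ \exp\left\{\frac{1}{2\sigma^2}\left[\sum_{i=1}^{n} 2(\mu_1 - \mu_0) x_i - n(\mu_1^2 - \mu_0^2)\right]\right\} \underset{H_0}{\overset{H_1}{\gtrless}} 1 \]

Taking the natural logarithm of both sides of the above expression yields

\[ \frac{\mu_1 - \mu_0}{\sigma^2} \sum_{i=1}^{n} x_i \underset{H_0}{\overset{H_1}{\gtrless}} \frac{n(\mu_1^2 - \mu_0^2)}{2\sigma^2} \]

or

\[ \frac{1}{n} \sum_{i=1}^{n} x_i \underset{H_0}{\overset{H_1}{\gtrless}} \frac{\mu_1 + \mu_0}{2} \tag{8.36} \]

Equation (8.36) indicates that the statistic

\[ s(X_1, \ldots, X_n) = \frac{1}{n} \sum_{i=1}^{n} X_i = \bar{X} \]

provides enough information about the observations to enable us to make a decision. Thus, it is called the sufficient statistic for the maximum likelihood test.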


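8.14. Consider the same observations \(X_i\), i = 1, ..., n, of radar signals as in Prob. 8.13, but now, under \(H_0\), \(X_i\) have zero mean and variance \(\sigma_0^2\), while under \(H_1\), \(X_i\) have zero mean and variance \(\sigma_1^2\), and \(\sigma_1^2 > \sigma_0^2\). Determine the maximum likelihood test.

In a similar manner as in Prob. 8.13, we obtain

\[ f(\mathbf{x} \mid H_0) = \frac{1}{(2\pi)^{n/2}\sigma_0^n} \exp\left(-\frac{1}{2\sigma_0^2} \sum_{i=1}^{n} x_i^2\right) \]
\[ f(\mathbf{x} \mid H_1) = \frac{1}{(2\pi)^{n/2}\sigma_1^n} \exp\left(-\frac{1}{2\sigma_1^2} \sum_{i=1}^{n} x_i^2\right) \]

With \(\sigma_1^2 - \sigma_0^2 > 0\), the likelihood ratio is

\[ \Lambda(\mathbf{x}) = \frac{f(\mathbf{x} \mid H_1)}{f(\mathbf{x} \mid H_0)} = \left(\frac{\sigma_0}{\sigma_1}\right)^n \exp\left[\left(\frac{\sigma_1^2 - \sigma_0^2}{2\sigma_0^2\sigma_1^2}\right) \sum_{i=1}^{n} x_i^2\right] \]

and the maximum likelihood test is

\[ \left(\frac{\sigma_0}{\sigma_1}\right)^n \exp\left[\left(\frac{\sigma_1^2 - \sigma_0^2}{2\sigma_0^2\sigma_1^2}\right) \sum_{i=1}^{n} x_i^2\right] \underset{H_0}{\overset{H_1}{\gtrless}} 1 \]

Taking the natural logarithm of both sides of the above expression yields

\[ \sum_{i=1}^{n} x_i^2 \underset{H_0}{\overset{H_1}{\gtrless}} n \left[\ln\left(\frac{\sigma_1}{\sigma_0}\right)\right] \left(\frac{2\sigma_0^2\sigma_1^2}{\sigma_1^2 - \sigma_0^2}\right) \tag{8.37} \]

Note that in this case,

\[ s(X_1, \ldots, X_n) = \sum_{i=1}^{n} X_i^2 \]

is the sufficient statistic for the maximum likelihood test.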


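8.15. In the binary communication system of Prob. 8.6, suppose that we have n independent observations \(X_i = X(t_i)\), i = 1, ..., n, where \(0 < t_1 < \cdots < t_n \le T\).

(a) Determine the maximum likelihood test.
(b) Find \(P_I\) and \(P_{II}\) for n = 5 and n = 10.

(a) Setting \(\mu_0 = 0\) and \(\mu_1 = 1\) in Eq. (8.36), the maximum likelihood test is

\[ \frac{1}{n} \sum_{i=1}^{n} x_i \underset{H_0}{\overset{H_1}{\gtrless}} \frac{1}{2} \]

(b) Let

\[ Y = \frac{1}{n} \sum_{i=1}^{n} X_i \]

Then by Eqs. (4.108) and (4.112), and the result of Prob. 5.60, we see that Y is a normal r.v. with zero mean and variance 1/n under \(H_0\), and is a normal r.v. with mean 1 and variance 1/n under \(H_1\). Thus

\[ P_I = P(D_1 \mid H_0) = \int_{R_1} f_Y(y \mid H_0)\, dy = \frac{\sqrt{n}}{\sqrt{2\pi}} \int_{1/2}^{\infty} e^{-(n/2)y^2}\, dy = \frac{1}{\sqrt{2\pi}} \int_{\sqrt{n}/2}^{\infty} e^{-z^2/2}\, dz = 1 - \Phi(\sqrt{n}/2) \]

\[ P_{II} = P(D_0 \mid H_1) = \int_{R_0} f_Y(y \mid H_1)\, dy = \frac{\sqrt{n}}{\sqrt{2\pi}} \int_{-\infty}^{1/2} e^{-(n/2)(y-1)^2}\, dy = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-\sqrt{n}/2} e^{-z^2/2}\, dz = \Phi(-\sqrt{n}/2) = 1 - \Phi(\sqrt{n}/2) \]

Note that \(P_I = P_{II}\). Using Table A (Appendix A), we have

\[ P_I = P_{II} = 1 - \Phi(1.118) = 0.1318 \qquad \text{for } n = 5 \]
\[ P_I = P_{II} = 1 - \Phi(1.581) = 0.057 \qquad \text{for } n = 10 \]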
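The dependence of the error probability on the number of observations in Prob. 8.15 is easy to tabulate. The following is a small sketch with a hypothetical function name, again using `statistics.NormalDist` in place of Table A.

```python
import math
from statistics import NormalDist

def ml_error_probability(n):
    """P_I = P_II = 1 - Phi(sqrt(n)/2) for the sample-mean ML test of Prob. 8.15."""
    return 1.0 - NormalDist().cdf(math.sqrt(n) / 2.0)

for n in (1, 5, 10, 20):
    print(f"n={n:2d}: P_I = P_II = {ml_error_probability(n):.4f}")
```

For n = 1 the value reduces to the 0.3085 of Prob. 8.6, and it decreases steadily as more independent observations are averaged.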


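8.16. In the binary communication system of Prob. 8.6, suppose that \(s_0(t)\) and \(s_1(t)\) are arbitrary signals and that n observations of the received signal x(t) are made. Let n samples of \(s_0(t)\) and \(s_1(t)\) be represented, respectively, by

\[ \mathbf{s}_0 = [s_{01}, s_{02}, \ldots, s_{0n}]^T \quad \text{and} \quad \mathbf{s}_1 = [s_{11}, s_{12}, \ldots, s_{1n}]^T \]

where T denotes "transpose of." Determine the MAP test.

For each \(X_i\), we can write

\[ f(x_i \mid H_0) = \frac{1}{\sqrt{2\pi}} \exp\left[-\tfrac{1}{2}(x_i - s_{0i})^2\right] \]
\[ f(x_i \mid H_1) = \frac{1}{\sqrt{2\pi}} \exp\left[-\tfrac{1}{2}(x_i - s_{1i})^2\right] \]

Since the noise components are independent, we have

\[ f(\mathbf{x} \mid H_j) = \prod_{i=1}^{n} f(x_i \mid H_j) \qquad j = 0, 1 \]

and the likelihood ratio is given by

\[ \Lambda(\mathbf{x}) = \frac{f(\mathbf{x} \mid H_1)}{f(\mathbf{x} \mid H_0)} = \exp\left[\sum_{i=1}^{n} (s_{1i} - s_{0i}) x_i - \frac{1}{2} \sum_{i=1}^{n} (s_{1i}^2 - s_{0i}^2)\right] \]

Thus, by Eq. (8.15), the MAP test is given by

\[ \exp\left[\sum_{i=1}^{n} (s_{1i} - s_{0i}) x_i - \frac{1}{2} \sum_{i=1}^{n} (s_{1i}^2 - s_{0i}^2)\right] \underset{H_0}{\overset{H_1}{\gtrless}} \eta = \frac{P(H_0)}{P(H_1)} \]

Taking the natural logarithm of both sides of the above expression yields

\[ \sum_{i=1}^{n} (s_{1i} - s_{0i}) x_i \underset{H_0}{\overset{H_1}{\gtrless}} \ln\left[\frac{P(H_0)}{P(H_1)}\right] + \frac{1}{2} \sum_{i=1}^{n} (s_{1i}^2 - s_{0i}^2) \tag{8.38} \]


Supplementary Problems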


8.17. Let \((X_1, \ldots, X_n)\) be a random sample of a Bernoulli r.v. X with pmf

\[ f(x; p) = p^x (1 - p)^{1-x} \qquad x = 0, 1 \]

where it is known that \(0 < p \le \tfrac{1}{2}\). Let

\(H_0\): \(p = \tfrac{1}{2}\)
\(H_1\): \(p = p_1\) (\(< \tfrac{1}{2}\))

and n = 20. As a decision test, we use the rule to reject \(H_0\) if \(\sum_{i=1}^{n} x_i \le 6\).

(a) Find the power function g(p) of the test.
(b) Find the probability of a Type I error \(\alpha\).
(c) Find the probability of a Type II error \(\beta\) (i) when \(p_1 = \tfrac{1}{4}\) and (ii) when \(p_1 = \tfrac{1}{10}\).

Ans. (a) \(g(p) = \sum_{k=0}^{6} \binom{20}{k} p^k (1 - p)^{20-k}\), \(0 < p \le \tfrac{1}{2}\)
(b) \(\alpha = 0.0577\); (c) (i) \(\beta = 0.2142\), (ii) \(\beta = 0.0024\)


8.18. Let \((X_1, \ldots, X_n)\) be a random sample of a normal r.v. X with mean \(\mu\) and variance 36. Let

\(H_0\): \(\mu = 50\)
\(H_1\): \(\mu = 55\)

As a decision test, we use the rule to accept \(H_0\) if \(\bar{x} < 53\), where \(\bar{x}\) is the value of the sample mean.

(a) Find the expression for the critical region \(R_1\).
(b) Find \(\alpha\) and \(\beta\) for n = 16.

Ans. (a) \(R_1 = \{(x_1, \ldots, x_n): \bar{x} \ge 53\}\) where \(\bar{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i\)
(b) \(\alpha = 0.0228\), \(\beta = 0.0913\)


8.19. Let \((X_1, \ldots, X_n)\) be a random sample of a normal r.v. X with mean \(\mu\) and variance 100. Let

\(H_0\): \(\mu = 50\)
\(H_1\): \(\mu = 55\)

As a decision test, we use the rule that we reject \(H_0\) if \(\bar{x} \ge c\). Find the value of c and sample size n such that \(\alpha = 0.025\) and \(\beta = 0.05\).

Ans. c = 52.718, n = 52


8.20. Let X be a normal r.v. with zero mean and variance \(\sigma^2\). Let

\(H_0\): \(\sigma^2 = 1\)
\(H_1\): \(\sigma^2 = 4\)

Determine the maximum likelihood test.

Ans. \(|x| \underset{H_0}{\overset{H_1}{\gtrless}} 1.36\)


8.21. Consider the binary decision problem of Prob. 8.20. Let \(P(H_0) = \tfrac{2}{3}\) and \(P(H_1) = \tfrac{1}{3}\). Determine the MAP test.

Ans. \(|x| \underset{H_0}{\overset{H_1}{\gtrless}} 1.923\)


8.22. Consider the binary communication system of Prob. 8.6.

(a) Construct a Neyman-Pearson test for the case where \(\alpha = 0.1\).
(b) Find \(\beta\).

Ans. (a) \(x \underset{H_0}{\overset{H_1}{\gtrless}} 1.282\); (b) \(\beta = 0.6111\)


8.23. Consider the binary decision problem of Prob. 8.11. Determine the Bayes' test if \(P(H_0) = 0.25\) and the Bayes' costs are

\[ C_{00} = C_{11} = 0 \qquad C_{01} = 1 \qquad C_{10} = 2 \]

Ans. \(|x| \underset{H_0}{\overset{H_1}{\lessgtr}} 1.10\)

Chapter 9 


Queueing Theory 


9.1 INTRODUCTION 


Queueing theory deals with the study of queues (waiting lines). Queues abound in practical situations. The earliest use of queueing theory was in the design of a telephone system. Applications of queueing theory are found in fields as seemingly diverse as traffic control, hospital management, and time-shared computer system design. In this chapter, we present an elementary queueing theory.


9.2 QUEUEING SYSTEMS 
A. Description: 


A simple queueing system is shown in Fig. 9-1. Customers arrive randomly at an average rate of \(\lambda_a\). Upon arrival, they are served without delay if there are available servers; otherwise, they are made to wait in the queue until it is their turn to be served. Once served, they are assumed to leave the system. We will be interested in determining such quantities as the average number of customers in the system, the average time a customer spends in the system, the average time spent waiting in the queue, etc.

[Figure: arrivals enter the queue, pass to service, and then depart.]

Fig. 9-1 A simple queueing system.


The description of any queueing system requires the specification of three parts:

1. The arrival process
2. The service mechanism, such as the number of servers and service-time distribution
3. The queue discipline (for example, first-come, first-served)


B. Classification: 


The notation A/B/s/K is used to classify a queueing system, where A specifies the type of arrival 
process, B denotes the service-time distribution, s specifies the number of servers, and K denotes the 
capacity of the system, that is, the maximum number of customers that can be accommodated. If K is 
not specified, it is assumed that the capacity of the system is unlimited. For example, an M/M/2 
queueing system (M stands for Markov) is one with Poisson arrivals (or exponential interarrival time 
distribution), exponential service-time distribution, and 2 servers. An M/G/1 queueing system has 
Poisson arrivals, general service-time distribution, and a single server. A special case is the M/D/1 
queueing system, where D stands for constant (deterministic) service time. Examples of queueing 
systems with limited capacity are telephone systems with limited trunks, hospital emergency rooms 
with limited beds, and airline terminals with limited space in which to park aircraft for loading and 
unloading. In each case, customers who arrive when the system is saturated are denied entrance and 
are lost. 


C. Useful Formulas 


Some basic quantities of queueing systems are

$L$: the average number of customers in the system
$L_q$: the average number of customers waiting in the queue
$L_s$: the average number of customers in service
$W$: the average amount of time that a customer spends in the system
$W_q$: the average amount of time that a customer spends waiting in the queue
$W_s$: the average amount of time that a customer spends in service


Many useful relationships between the above and other quantities of interest can be obtained by 
using the following basic cost identity: 

Assume that entering customers are required to pay an entrance fee (according to some rule) to 
the system. Then we have 


$$\text{Average rate at which the system earns} = \lambda_a \times \text{average amount an entering customer pays} \tag{9.1}$$

where $\lambda_a$ is the average arrival rate of entering customers

$$\lambda_a = \lim_{t \to \infty} \frac{X(t)}{t}$$

and $X(t)$ denotes the number of customer arrivals by time $t$.
If we assume that each customer pays \$1 per unit time while in the system, Eq. (9.1) yields

$$L = \lambda_a W \tag{9.2}$$

Equation (9.2) is sometimes known as Little's formula.

Similarly, if we assume that each customer pays \$1 per unit time while in the queue, Eq. (9.1)
yields

$$L_q = \lambda_a W_q \tag{9.3}$$

If we assume that each customer pays \$1 per unit time while in service, Eq. (9.1) yields

$$L_s = \lambda_a W_s \tag{9.4}$$


Note that Eqs. (9.2) to (9.4) are valid for almost all queueing systems, regardless of the arrival process, 
the number of servers, or queueing discipline. 
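Little's formula (9.2) can be checked on a single sample path. The sketch below is only an illustration: the function name `little_check` and the arrival and departure times are made up for the example, and the observation window is assumed to end when the last customer leaves, so that the system starts and ends empty. Under that assumption the time-average number in the system equals the measured arrival rate times the average time in the system, up to floating-point rounding.

```python
def little_check(arrivals, departures):
    """Measure (L, lambda_a, W) on one sample path over [0, T].

    arrivals[i] and departures[i] are the arrival and departure times of
    customer i (hypothetical data); T is the last departure time.
    """
    T = max(departures)
    n = len(arrivals)
    # Integrate N(t), the number in the system, by sweeping over events.
    events = sorted([(t, +1) for t in arrivals] + [(t, -1) for t in departures])
    area, count, last_t = 0.0, 0, 0.0
    for t, step in events:
        area += count * (t - last_t)  # N(t) is constant between events
        count += step
        last_t = t
    L = area / T                      # time-average number in the system
    lam = n / T                       # observed arrival rate
    W = sum(d - a for a, d in zip(arrivals, departures)) / n  # mean time in system
    return L, lam, W


arrivals = [0.0, 1.0, 1.5, 4.0]
departures = [2.0, 3.0, 3.5, 5.0]
L, lam, W = little_check(arrivals, departures)
print(L, lam * W)
```

The check works for any service order, which reflects the remark that Eqs. (9.2) to (9.4) do not depend on the queue discipline.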


9.3 BIRTH-DEATH PROCESS


We say that the queueing system is in state $S_n$ if there are $n$ customers in the system, including
those being served. Let $N(t)$ be the Markov process that takes on the value $n$ when the queueing
system is in state $S_n$, with the following assumptions:


1. If the system is in state $S_n$, it can make transitions only to $S_{n-1}$ or $S_{n+1}$, $n \geq 1$; that is, either a
customer completes service and leaves the system or, while the present customer is still being
serviced, another customer arrives at the system; from $S_0$, the next state can only be $S_1$.

2. If the system is in state $S_n$ at time $t$, the probability of a transition to $S_{n+1}$ in the time interval
$(t, t + \Delta t)$ is $a_n \Delta t$. We refer to $a_n$ as the arrival parameter (or the birth parameter).

3. If the system is in state $S_n$ at time $t$, the probability of a transition to $S_{n-1}$ in the time interval
$(t, t + \Delta t)$ is $d_n \Delta t$. We refer to $d_n$ as the departure parameter (or the death parameter).


The process $N(t)$ is sometimes referred to as the birth-death process.
Let $p_n(t)$ be the probability that the queueing system is in state $S_n$ at time $t$; that is,

$$p_n(t) = P\{N(t) = n\} \tag{9.5}$$

Then we have the following fundamental recursive equations for $N(t)$ (Prob. 9.2):

$$\begin{aligned}
p_n'(t) &= -(a_n + d_n)p_n(t) + a_{n-1}p_{n-1}(t) + d_{n+1}p_{n+1}(t) \qquad n \geq 1 \\
p_0'(t) &= -(a_0 + d_0)p_0(t) + d_1 p_1(t)
\end{aligned} \tag{9.6}$$

Assume that in the steady state we have

$$\lim_{t \to \infty} p_n(t) = p_n \tag{9.7}$$

and setting $p_0'(t) = 0$ and $p_n'(t) = 0$ in Eqs. (9.6), we obtain the following steady-state recursive equation:

$$(a_n + d_n)p_n = a_{n-1}p_{n-1} + d_{n+1}p_{n+1} \qquad n \geq 1 \tag{9.8}$$

and for the special case with $d_0 = 0$,

$$a_0 p_0 = d_1 p_1 \tag{9.9}$$


Equations (9.8) and (9.9) are also known as the steady-state equilibrium equations. The state transition 
diagram for the birth-death process is shown in Fig. 9-2. 


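[Figure: chain of states $S_0, S_1, S_2, \ldots$ with forward transition parameters $a_0, a_1, a_2, \ldots$ and backward transition parameters $d_1, d_2, d_3, \ldots$]

Fig. 9-2 State transition diagram for the birth-death process.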


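Solving Eqs. (9.8) and (9.9) in terms of $p_0$, we obtain

$$\begin{aligned}
p_1 &= \frac{a_0}{d_1}\, p_0 \\
p_2 &= \frac{a_0 a_1}{d_1 d_2}\, p_0 \\
&\;\;\vdots \\
p_n &= \frac{a_0 a_1 \cdots a_{n-1}}{d_1 d_2 \cdots d_n}\, p_0
\end{aligned} \tag{9.10}$$

where $p_0$ can be determined from the fact that

$$\sum_{n=0}^{\infty} p_n = \left(1 + \frac{a_0}{d_1} + \frac{a_0 a_1}{d_1 d_2} + \cdots\right) p_0 = 1 \tag{9.11}$$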

provided that the summation in parentheses converges to a finite value. 
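Equations (9.10) and (9.11) translate directly into a short numerical routine when the number of states is finite. The sketch below is a minimal illustration, not part of the original text: the function name `birth_death_steady_state` and its list-based interface are assumptions, and an infinite chain would have to be truncated at some large state before using it.

```python
def birth_death_steady_state(a, d):
    """Steady-state probabilities [p_0, ..., p_N] of a finite birth-death chain.

    a[n] is the arrival parameter a_n      for n = 0, ..., N-1
    d[n] is the departure parameter d_{n+1} for n = 0, ..., N-1
    """
    weights = [1.0]  # p_n / p_0, built with the product form of Eq. (9.10)
    for a_n, d_next in zip(a, d):
        weights.append(weights[-1] * a_n / d_next)
    total = sum(weights)  # normalization from Eq. (9.11)
    return [w / total for w in weights]


# Example: constant parameters a_n = 4 and d_n = 6 with states 0..3.
p = birth_death_steady_state(a=[4.0, 4.0, 4.0], d=[6.0, 6.0, 6.0])
print(p)
```

With state-dependent lists for `a` and `d`, the same routine covers each of the finite-capacity systems treated later in the chapter.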


9.4 THE M/M/1 QUEUEING SYSTEM 


In the M/M/1 queueing system, the arrival process is the Poisson process with rate $\lambda$ (the mean
arrival rate) and the service time is exponentially distributed with parameter $\mu$ (the mean service rate).
Then the process $N(t)$ describing the state of the M/M/1 queueing system at time $t$ is a birth-death
process with the following state-independent parameters:

$$a_n = \lambda \quad n \geq 0 \qquad\qquad d_n = \mu \quad n \geq 1 \tag{9.12}$$

Then from Eqs. (9.10) and (9.11), we obtain (Prob. 9.3)

$$p_0 = 1 - \frac{\lambda}{\mu} = 1 - \rho \tag{9.13}$$

$$p_n = \left(1 - \frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^n = (1 - \rho)\rho^n \tag{9.14}$$

where $\rho = \lambda/\mu < 1$, which implies that the server, on the average, must process the customers faster
than their average arrival rate; otherwise the queue length (the number of customers waiting in the
queue) tends to infinity. The ratio $\rho = \lambda/\mu$ is sometimes referred to as the traffic intensity of the
system. The traffic intensity of the system is defined as

$$\text{Traffic intensity} = \frac{\text{mean service time}}{\text{mean interarrival time}} = \frac{\text{mean arrival rate}}{\text{mean service rate}}$$


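The average number of customers in the system is given by (Prob. 9.4)

$$L = \frac{\rho}{1 - \rho} = \frac{\lambda}{\mu - \lambda} \tag{9.15}$$

Then, setting $\lambda_a = \lambda$ in Eqs. (9.2) to (9.4), we obtain (Prob. 9.5)

$$W = \frac{1}{\mu - \lambda} = \frac{1}{\mu(1 - \rho)} \tag{9.16}$$

$$W_q = \frac{\lambda}{\mu(\mu - \lambda)} = \frac{\rho}{\mu(1 - \rho)} \tag{9.17}$$

$$L_q = \frac{\lambda^2}{\mu(\mu - \lambda)} = \frac{\rho^2}{1 - \rho} \tag{9.18}$$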
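The closed forms (9.15) to (9.18) are easy to evaluate together. The following sketch assumes a hypothetical helper named `mm1_metrics`; the rates used in the example (one arrival per 10 minutes, mean service time of 8 minutes) are illustrative values in units of customers per minute.

```python
def mm1_metrics(lam, mu):
    """Steady-state averages of an M/M/1 queue from Eqs. (9.15) to (9.18)."""
    if not 0 < lam < mu:
        raise ValueError("a steady state requires 0 < lam < mu")
    rho = lam / mu                    # traffic intensity
    return {
        "rho": rho,
        "L": rho / (1 - rho),         # Eq. (9.15)
        "W": 1 / (mu - lam),          # Eq. (9.16)
        "Wq": rho / (mu * (1 - rho)), # Eq. (9.17)
        "Lq": rho**2 / (1 - rho),     # Eq. (9.18)
    }


m = mm1_metrics(lam=1 / 10, mu=1 / 8)
print(m)
```

For these rates the helper gives about 4 customers in the system, 40 minutes in the system, and 32 minutes in the queue, and the results satisfy $L = \lambda W$ and $L_q = \lambda W_q$.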


9.5 THE M/M/s QUEUEING SYSTEM 


In the M/M/s queueing system, the arrival process is the Poisson process with rate $\lambda$ and each of
the $s$ servers has an exponential service time with parameter $\mu$. In this case, the process $N(t)$ describing
the state of the M/M/s queueing system at time $t$ is a birth-death process with the following
parameters:

$$a_n = \lambda \quad n \geq 0 \qquad\qquad d_n = \begin{cases} n\mu & 0 < n < s \\ s\mu & n \geq s \end{cases} \tag{9.19}$$

Note that the departure parameter $d_n$ is state dependent. Then, from Eqs. (9.10) and (9.11), we obtain
(Prob. 9.10)

$$p_0 = \left[\sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \frac{(s\rho)^s}{s!(1 - \rho)}\right]^{-1} \tag{9.20}$$

$$p_n = \begin{cases} \dfrac{(s\rho)^n}{n!}\, p_0 & n < s \\[2ex] \dfrac{\rho^n s^s}{s!}\, p_0 & n \geq s \end{cases} \tag{9.21}$$

where $\rho = \lambda/(s\mu) < 1$. Note that the ratio $\rho = \lambda/(s\mu)$ is the traffic intensity of the M/M/s queueing
system. The average number of customers in the system and the average number of customers in the
queue are given, respectively, by (Prob. 9.12)

$$L = \frac{\lambda}{\mu} + \frac{\rho(s\rho)^s}{s!(1 - \rho)^2}\, p_0 \tag{9.22}$$

$$L_q = \frac{\rho(s\rho)^s}{s!(1 - \rho)^2}\, p_0 = L - \frac{\lambda}{\mu} \tag{9.23}$$

By Eqs. (9.2) and (9.3), the quantities $W$ and $W_q$ are given by

$$W = \frac{L}{\lambda} \tag{9.24}$$

$$W_q = \frac{L_q}{\lambda} = W - \frac{1}{\mu} \tag{9.25}$$
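As a numerical companion to Eqs. (9.20) to (9.25), the sketch below evaluates the M/M/s averages. The function name `mms_metrics` and the example rates are assumptions chosen for illustration (two servers, 33 arrivals per hour, mean service time of 3 minutes, all expressed per minute).

```python
from math import factorial


def mms_metrics(lam, mu, s):
    """Steady-state averages of an M/M/s queue from Eqs. (9.20) to (9.25)."""
    rho = lam / (s * mu)              # traffic intensity
    if not 0 < rho < 1:
        raise ValueError("a steady state requires 0 < lam < s * mu")
    a = s * rho                       # equals lam / mu
    tail = a**s / (factorial(s) * (1 - rho))
    p0 = 1 / (sum(a**n / factorial(n) for n in range(s)) + tail)  # Eq. (9.20)
    Lq = rho * a**s / (factorial(s) * (1 - rho) ** 2) * p0        # Eq. (9.23)
    L = Lq + lam / mu                                             # Eq. (9.22)
    return {"p0": p0, "L": L, "Lq": Lq, "W": L / lam, "Wq": Lq / lam}


two_servers = mms_metrics(lam=33 / 60, mu=1 / 3, s=2)
one_server = mms_metrics(lam=1 / 10, mu=1 / 8, s=1)
print(two_servers["Wq"], one_server["L"])
```

Setting `s=1` reduces the formulas to the M/M/1 results, which gives a quick consistency check on the implementation.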
9.6 THE M/M/1/K QUEUEING SYSTEM


In the M/M/1/K queueing system, the capacity of the system is limited to $K$ customers. When the
system reaches its capacity, the effect is to reduce the arrival rate to zero until such time as a customer
is served to again make queue space available. Thus, the M/M/1/K queueing system can be
modeled as a birth-death process with the following parameters:

$$a_n = \begin{cases} \lambda & 0 \leq n < K \\ 0 & n \geq K \end{cases} \qquad\qquad d_n = \mu \quad n \geq 1 \tag{9.26}$$

Then, from Eqs. (9.10) and (9.11), we obtain (Prob. 9.14)

$$p_0 = \frac{1 - (\lambda/\mu)}{1 - (\lambda/\mu)^{K+1}} = \frac{1 - \rho}{1 - \rho^{K+1}} \qquad \rho \neq 1 \tag{9.27}$$

$$p_n = \left(\frac{\lambda}{\mu}\right)^n p_0 = \frac{(1 - \rho)\rho^n}{1 - \rho^{K+1}} \qquad n = 1, \ldots, K \tag{9.28}$$

where $\rho = \lambda/\mu$. It is important to note that it is no longer necessary that the traffic intensity $\rho = \lambda/\mu$
be less than 1. Customers are denied service when the system is in state $K$. Since the fraction of
arrivals that actually enter the system is $1 - p_K$, the effective arrival rate is given by

$$\lambda_a = \lambda(1 - p_K) \tag{9.29}$$

The average number of customers in the system is given by (Prob. 9.15)

$$L = \rho\, \frac{1 - (K + 1)\rho^K + K\rho^{K+1}}{(1 - \rho)(1 - \rho^{K+1})} \qquad \rho = \frac{\lambda}{\mu} \tag{9.30}$$

Then, setting $\lambda_a = \lambda(1 - p_K)$ in Eqs. (9.2) to (9.4), we obtain

$$W = \frac{L}{\lambda_a} = \frac{L}{\lambda(1 - p_K)} \tag{9.31}$$

$$W_q = W - \frac{1}{\mu} \tag{9.32}$$

$$L_q = \lambda_a W_q = \lambda(1 - p_K)W_q \tag{9.33}$$
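The M/M/1/K relations (9.27) to (9.33) can be evaluated as in the sketch below. The name `mm1k_metrics` and the example values (4 arrivals per hour, 6 services per hour, room for 3 customers) are illustrative assumptions. The branch for $\rho = 1$ is also an addition: the closed form (9.27) is indeterminate there, but $p_n = \rho^n p_0$ then makes all $K + 1$ states equally likely.

```python
def mm1k_metrics(lam, mu, K):
    """Steady-state quantities of an M/M/1/K queue, Eqs. (9.27) to (9.33)."""
    rho = lam / mu
    if rho == 1:
        probs = [1.0 / (K + 1)] * (K + 1)        # uniform when lam == mu
    else:
        p0 = (1 - rho) / (1 - rho ** (K + 1))    # Eq. (9.27)
        probs = [p0 * rho**n for n in range(K + 1)]  # Eq. (9.28)
    L = sum(n * p for n, p in enumerate(probs))  # agrees with Eq. (9.30)
    lam_a = lam * (1 - probs[K])                 # Eq. (9.29)
    W = L / lam_a                                # Eq. (9.31)
    Wq = W - 1 / mu                              # Eq. (9.32)
    Lq = lam_a * Wq                              # Eq. (9.33)
    return {"p": probs, "L": L, "lam_a": lam_a, "W": W, "Wq": Wq, "Lq": Lq}


r = mm1k_metrics(lam=4.0, mu=6.0, K=3)
balanced = mm1k_metrics(lam=1.0, mu=1.0, K=4)
print(r["p"][3], r["L"], balanced["L"])
```

In the example about 12.3 percent of arrivals find the system full, and in the balanced case the average number in the system is $K/2$.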


9.7 THE M/M/s/K QUEUEING SYSTEM 


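Similarly, the M/M/s/K queueing system can be modeled as a birth-death process with the following
parameters:

$$a_n = \begin{cases} \lambda & 0 \leq n < K \\ 0 & n \geq K \end{cases} \qquad\qquad d_n = \begin{cases} n\mu & 0 < n < s \\ s\mu & n \geq s \end{cases} \tag{9.34}$$

Then, from Eqs. (9.10) and (9.11), we obtain (Prob. 9.17)

$$p_0 = \left[\sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \frac{(s\rho)^s}{s!}\left(\frac{1 - \rho^{K-s+1}}{1 - \rho}\right)\right]^{-1} \tag{9.35}$$

$$p_n = \begin{cases} \dfrac{(s\rho)^n}{n!}\, p_0 & n < s \\[2ex] \dfrac{\rho^n s^s}{s!}\, p_0 & s \leq n \leq K \end{cases} \tag{9.36}$$

where $\rho = \lambda/(s\mu)$. Note that the expression for $p_n$ is identical in form to that for the M/M/s system,
Eq. (9.21). They differ only in the $p_0$ term. Again, it is not necessary that $\rho = \lambda/(s\mu)$ be less than 1. The
average number of customers in the queue is given by (Prob. 9.18)

$$L_q = p_0\, \frac{\rho(\rho s)^s}{s!(1 - \rho)^2} \left\{1 - [1 + (1 - \rho)(K - s)]\rho^{K-s}\right\} \tag{9.37}$$

The average number of customers in the system is

$$L = L_q + \frac{\lambda_a}{\mu} = L_q + \frac{\lambda}{\mu}(1 - p_K) \tag{9.38}$$

The quantities $W$ and $W_q$ are given by

$$W = \frac{L}{\lambda_a} = \frac{L_q}{\lambda_a} + \frac{1}{\mu} \tag{9.39}$$

$$W_q = \frac{L_q}{\lambda_a} = \frac{L_q}{\lambda(1 - p_K)} \tag{9.40}$$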
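For the M/M/s/K system it is often simpler to normalize the unscaled terms of Eq. (9.36) numerically than to evaluate the closed forms (9.35) and (9.37). The sketch below takes that route; the function name `mmsk_metrics` and the example values (4 servers, capacity 12, 3 arrivals per hour, 0.5 services per hour per server) are assumptions for illustration.

```python
from math import factorial


def mmsk_metrics(lam, mu, s, K):
    """Steady-state quantities of an M/M/s/K queue via Eqs. (9.36) to (9.40)."""
    a = lam / mu          # equals s * rho
    rho = a / s
    # Unnormalized p_n / p_0 from Eq. (9.36).
    weights = [
        a**n / factorial(n) if n < s else rho**n * s**s / factorial(s)
        for n in range(K + 1)
    ]
    total = sum(weights)                       # 1 / p_0, as in Eq. (9.35)
    p = [w / total for w in weights]
    Lq = sum((n - s) * p[n] for n in range(s, K + 1))  # definition behind Eq. (9.37)
    lam_a = lam * (1 - p[K])                   # effective arrival rate
    L = Lq + lam_a / mu                        # Eq. (9.38)
    return {"p": p, "Lq": Lq, "L": L, "W": L / lam_a, "Wq": Lq / lam_a}


r = mmsk_metrics(lam=3.0, mu=0.5, s=4, K=12)
loss_only = mmsk_metrics(lam=3.0, mu=0.5, s=4, K=4)
print(r["p"][0], r["p"][12], r["Lq"], loss_only["p"][4])
```

Carried at full precision, the example gives $p_0 \approx 0.0002436$, $p_{12} \approx 0.337$, and $L_q \approx 6.15$. A hand calculation that first rounds $p_0$ to 0.00024 gives slightly smaller values (about 0.332 and 6.06), so small discrepancies of that size are a rounding effect rather than a modeling difference.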


Solved Problems 


9.1. Deduce the basic cost identity (9.1).

Let $T$ be a fixed large number. The amount of money earned by the system by time $T$ can be computed
by multiplying the average rate at which the system earns by the length of time $T$. On the other
hand, it can also be computed by multiplying the average amount paid by an entering customer by the
average number of customers entering by time $T$, which is equal to $\lambda_a T$, where $\lambda_a$ is the average arrival rate
of entering customers. Thus, we have

$$\text{Average rate at which the system earns} \times T = \text{average amount paid by an entering customer} \times (\lambda_a T)$$

Dividing both sides by $T$ (and letting $T \to \infty$), we obtain Eq. (9.1).


9.2. Derive Eq. (9.6).

From properties 1 to 3 of the birth-death process $N(t)$, we see that at time $t + \Delta t$, the system can be in
state $S_n$ in three ways:

1. By being in state $S_n$ at time $t$ and no transition occurring in the time interval $(t, t + \Delta t)$. This happens
with probability $(1 - a_n \Delta t)(1 - d_n \Delta t) \approx 1 - (a_n + d_n)\Delta t$ [by neglecting the second-order effect
$a_n d_n (\Delta t)^2$].

2. By being in state $S_{n-1}$ at time $t$ and a transition to $S_n$ occurring in the time interval $(t, t + \Delta t)$. This
happens with probability $a_{n-1} \Delta t$.

3. By being in state $S_{n+1}$ at time $t$ and a transition to $S_n$ occurring in the time interval $(t, t + \Delta t)$. This
happens with probability $d_{n+1} \Delta t$.

Let $p_n(t) = P[N(t) = n]$. Then, using the Markov property of $N(t)$, we obtain

$$\begin{aligned}
p_n(t + \Delta t) &= [1 - (a_n + d_n)\Delta t]\,p_n(t) + a_{n-1}\Delta t\, p_{n-1}(t) + d_{n+1}\Delta t\, p_{n+1}(t) \qquad n \geq 1 \\
p_0(t + \Delta t) &= [1 - (a_0 + d_0)\Delta t]\,p_0(t) + d_1 \Delta t\, p_1(t)
\end{aligned}$$

Rearranging the above relations,

$$\begin{aligned}
\frac{p_n(t + \Delta t) - p_n(t)}{\Delta t} &= -(a_n + d_n)p_n(t) + a_{n-1}p_{n-1}(t) + d_{n+1}p_{n+1}(t) \qquad n \geq 1 \\
\frac{p_0(t + \Delta t) - p_0(t)}{\Delta t} &= -(a_0 + d_0)p_0(t) + d_1 p_1(t)
\end{aligned}$$

Letting $\Delta t \to 0$, we obtain

$$\begin{aligned}
p_n'(t) &= -(a_n + d_n)p_n(t) + a_{n-1}p_{n-1}(t) + d_{n+1}p_{n+1}(t) \qquad n \geq 1 \\
p_0'(t) &= -(a_0 + d_0)p_0(t) + d_1 p_1(t)
\end{aligned}$$


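9.3. Derive Eqs. (9.13) and (9.14).

Setting $a_n = \lambda$, $d_0 = 0$, and $d_n = \mu$ in Eq. (9.10), we get

$$\begin{aligned}
p_1 &= \frac{\lambda}{\mu}\, p_0 = \rho p_0 \\
&\;\;\vdots \\
p_n &= \left(\frac{\lambda}{\mu}\right)^n p_0 = \rho^n p_0
\end{aligned}$$

where $p_0$ is determined by equating

$$\sum_{n=0}^{\infty} p_n = p_0 \sum_{n=0}^{\infty} \rho^n = p_0\, \frac{1}{1 - \rho} = 1 \qquad |\rho| < 1$$

from which we obtain

$$p_0 = 1 - \rho = 1 - \frac{\lambda}{\mu}$$

$$p_n = \left(\frac{\lambda}{\mu}\right)^n p_0 = (1 - \rho)\rho^n = \left(1 - \frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^n$$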
9.4. Derive Eq. (9.15).

Since $p_n$ is the steady-state probability that the system contains exactly $n$ customers, using Eq. (9.14),
the average number of customers in the M/M/1 queueing system is given by

$$L = \sum_{n=0}^{\infty} n p_n = \sum_{n=0}^{\infty} n(1 - \rho)\rho^n = (1 - \rho)\sum_{n=0}^{\infty} n\rho^n \tag{9.41}$$

where $\rho = \lambda/\mu < 1$. Using the algebraic identity

$$\sum_{n=0}^{\infty} n x^n = \frac{x}{(1 - x)^2} \qquad |x| < 1 \tag{9.42}$$

we obtain

$$L = (1 - \rho)\frac{\rho}{(1 - \rho)^2} = \frac{\rho}{1 - \rho} = \frac{\lambda}{\mu - \lambda}$$

9.5. Derive Eqs. (9.16) to (9.18).

Since $\lambda_a = \lambda$, by Eqs. (9.2) and (9.15), we get

$$W = \frac{L}{\lambda} = \frac{1}{\mu - \lambda} = \frac{1}{\mu(1 - \rho)}$$

which is Eq. (9.16). Next, by definition,

$$W_q = W - W_s \tag{9.43}$$

where $W_s = 1/\mu$, that is, the average service time. Thus,

$$W_q = \frac{1}{\mu - \lambda} - \frac{1}{\mu} = \frac{\lambda}{\mu(\mu - \lambda)} = \frac{\rho}{\mu(1 - \rho)}$$

which is Eq. (9.17). Finally, by Eq. (9.3),

$$L_q = \lambda W_q = \frac{\lambda^2}{\mu(\mu - \lambda)} = \frac{\rho^2}{1 - \rho}$$

which is Eq. (9.18).


9.6. Let $W_a$ denote the amount of time an arbitrary customer spends in the M/M/1 queueing system.
Find the distribution of $W_a$.

We have

$$P\{W_a \leq a\} = \sum_{n=0}^{\infty} P\{W_a \leq a \mid n \text{ in the system when the customer arrives}\} \times P\{n \text{ in the system when the customer arrives}\} \tag{9.44}$$

where $n$ is the number of customers in the system. Now consider the amount of time $W_a$ that this customer
will spend in the system when there are already $n$ customers when he or she arrives. When $n = 0$, then
$W_a$ equals the service time. When $n \geq 1$, there will be one customer in service and $n - 1$ customers
waiting in line ahead of this customer's arrival. The customer in service might have been in service for some
time, but because of the memoryless property of the exponential distribution of the service time, it follows
that (see Prob. 2.57) the arriving customer would have to wait an exponential amount of time with parameter
$\mu$ for this customer to complete service. In addition, the customer also would have to wait an exponential
amount of time for each of the other $n - 1$ customers in line. Thus, adding his or her own service time,
the amount of time $W_a$ that the customer will spend in the system when there are already $n$ customers when
he or she arrives is the sum of $n + 1$ iid exponential r.v.'s with parameter $\mu$. Then by Prob. 4.33, we see that
this r.v. is a gamma r.v. with parameters $(n + 1, \mu)$. Thus, by Eq. (2.83),

$$P\{W_a \leq a \mid n \text{ in the system when customer arrives}\} = \int_0^a \mu e^{-\mu t}\, \frac{(\mu t)^n}{n!}\, dt$$

From Eq. (9.14),

$$P\{n \text{ in the system when customer arrives}\} = p_n = \left(1 - \frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^n$$

Hence, by Eq. (9.44),

$$\begin{aligned}
F_{W_a}(a) = P\{W_a \leq a\} &= \sum_{n=0}^{\infty} \int_0^a \mu e^{-\mu t}\, \frac{(\mu t)^n}{n!}\, dt \left(1 - \frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^n \\
&= \int_0^a (\mu - \lambda) e^{-\mu t} \sum_{n=0}^{\infty} \frac{(\lambda t)^n}{n!}\, dt \\
&= \int_0^a (\mu - \lambda) e^{-(\mu - \lambda)t}\, dt = 1 - e^{-(\mu - \lambda)a}
\end{aligned} \tag{9.45}$$

Thus, by Eq. (2.79), $W_a$ is an exponential r.v. with parameter $\mu - \lambda$. Note that from Eq. (2.99), $E(W_a) =
1/(\mu - \lambda)$, which agrees with Eq. (9.16), since $W = E(W_a)$.


9.7. Customers arrive at a watch repair shop according to a Poisson process at a rate of one per
every 10 minutes, and the service time is an exponential r.v. with mean 8 minutes.

(a) Find the average number of customers $L$, the average time a customer spends in the shop
$W$, and the average time a customer spends in waiting for service $W_q$.

(b) Suppose that the arrival rate of the customers increases 10 percent. Find the corresponding
changes in $L$, $W$, and $W_q$.

(a) The watch repair shop service can be modeled as an M/M/1 queueing system with $\lambda = \frac{1}{10}$, $\mu = \frac{1}{8}$. Thus,
from Eqs. (9.15), (9.16), and (9.43), we have

$$L = \frac{\lambda}{\mu - \lambda} = \frac{\frac{1}{10}}{\frac{1}{8} - \frac{1}{10}} = 4$$

$$W = \frac{1}{\mu - \lambda} = \frac{1}{\frac{1}{8} - \frac{1}{10}} = 40 \text{ minutes}$$

$$W_q = W - W_s = 40 - 8 = 32 \text{ minutes}$$

(b) Now $\lambda = \frac{1}{9}$, $\mu = \frac{1}{8}$ (the mean interarrival time drops from 10 to 9 minutes, which raises the arrival rate by about 11 percent). Then

$$L = \frac{\lambda}{\mu - \lambda} = \frac{\frac{1}{9}}{\frac{1}{8} - \frac{1}{9}} = 8$$

$$W = \frac{1}{\mu - \lambda} = \frac{1}{\frac{1}{8} - \frac{1}{9}} = 72 \text{ minutes}$$

$$W_q = W - W_s = 72 - 8 = 64 \text{ minutes}$$

It can be seen that an increase of roughly 10 percent in the customer arrival rate doubles the average number
of customers in the system. The average time a customer spends in queue is also doubled.


9.8. A drive-in banking service is modeled as an M/M/1 queueing system with customer arrival rate
of 2 per minute. It is desired to have fewer than 5 customers line up 99 percent of the time. How
fast should the service rate be?

From Eq. (9.14),

$$P\{5 \text{ or more customers in the system}\} = \sum_{n=5}^{\infty} p_n = \sum_{n=5}^{\infty} (1 - \rho)\rho^n = \rho^5 \qquad \rho = \frac{\lambda}{\mu}$$

In order to have fewer than 5 customers line up 99 percent of the time, we require that this probability be
less than 0.01. Thus,

$$\rho^5 = \left(\frac{\lambda}{\mu}\right)^5 \leq 0.01$$

from which we obtain

$$\mu^5 \geq \frac{\lambda^5}{0.01} = \frac{2^5}{0.01} = 3200 \qquad \text{or} \qquad \mu \geq 5.024$$

Thus, to meet the requirements, the average service rate must be at least 5.024 customers per minute.
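The inequality $(\lambda/\mu)^n \leq \varepsilon$ for the tail probability of an M/M/1 queue can be solved for $\mu$ in one line. The helper name `min_service_rate` below is hypothetical; the numbers are those of the drive-in bank example (arrival rate 2 per minute, at most a 1 percent chance of 5 or more customers).

```python
def min_service_rate(lam, n, tail_prob):
    """Smallest mu such that P{N >= n} = (lam / mu)**n <= tail_prob (M/M/1)."""
    return lam / tail_prob ** (1.0 / n)


mu_min = min_service_rate(lam=2.0, n=5, tail_prob=0.01)
print(round(mu_min, 3))
```

Because the tail probability $\rho^n$ decreases as $\mu$ grows, any service rate above the returned value also meets the requirement.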


9.9. People arrive at a telephone booth according to a Poisson process at an average rate of 12 per
hour, and the average time for each call is an exponential r.v. with mean 2 minutes.

(a) What is the probability that an arriving customer will find the telephone booth occupied?

(b) It is the policy of the telephone company to install additional booths if customers wait an
average of 3 or more minutes for the phone. Find the average arrival rate needed to justify a
second booth.

(a) The telephone service can be modeled as an M/M/1 queueing system with $\lambda = \frac{1}{5}$, $\mu = \frac{1}{2}$, and $\rho =
\lambda/\mu = \frac{2}{5}$. The probability that an arriving customer will find the telephone occupied is $P(N > 0)$, where
$N$ is the number of customers in the system. Thus, from Eq. (9.13),

$$P(N > 0) = 1 - p_0 = 1 - (1 - \rho) = \rho = \tfrac{2}{5} = 0.4$$

(b) From Eq. (9.17),

$$W_q = \frac{\lambda}{\mu(\mu - \lambda)} = \frac{\lambda}{0.5(0.5 - \lambda)} \geq 3$$

from which we obtain $\lambda \geq 0.3$ per minute. Thus, the required average arrival rate to justify the second
booth is 18 per hour.


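9.10. Derive Eqs. (9.20) and (9.21).

From Eqs. (9.19) and (9.10), we have

$$p_n = p_0 \prod_{k=0}^{n-1} \frac{\lambda}{(k + 1)\mu} = p_0 \left(\frac{\lambda}{\mu}\right)^n \frac{1}{n!} \qquad n < s \tag{9.46}$$

$$p_n = p_0 \prod_{k=0}^{s-1} \frac{\lambda}{(k + 1)\mu} \prod_{k=s}^{n-1} \frac{\lambda}{s\mu} = p_0 \left(\frac{\lambda}{\mu}\right)^n \frac{1}{s!\, s^{n-s}} \qquad n \geq s \tag{9.47}$$

Let $\rho = \lambda/(s\mu)$. Then Eqs. (9.46) and (9.47) can be rewritten as

$$p_n = \begin{cases} \dfrac{(s\rho)^n}{n!}\, p_0 & n < s \\[2ex] \dfrac{\rho^n s^s}{s!}\, p_0 & n \geq s \end{cases}$$

which is Eq. (9.21). From Eq. (9.11), $p_0$ is obtained by equating

$$\sum_{n=0}^{\infty} p_n = p_0 \left[\sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \sum_{n=s}^{\infty} \frac{\rho^n s^s}{s!}\right] = 1$$

Using the summation formula

$$\sum_{n=k}^{\infty} x^n = \frac{x^k}{1 - x} \qquad |x| < 1 \tag{9.48}$$

we obtain Eq. (9.20); that is,

$$p_0 = \left[\sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \frac{s^s}{s!} \sum_{n=s}^{\infty} \rho^n\right]^{-1} = \left[\sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \frac{(s\rho)^s}{s!(1 - \rho)}\right]^{-1}$$

provided $\rho = \lambda/(s\mu) < 1$.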


9.11. Consider an M/M/s queueing system. Find the probability that an arriving customer is forced to
join the queue.

An arriving customer is forced to join the queue when all servers are busy, that is, when the number
of customers in the system is equal to or greater than $s$. Thus, using Eqs. (9.20) and (9.21), we get

$$P(\text{a customer is forced to join queue}) = \sum_{n=s}^{\infty} p_n = p_0\, \frac{s^s}{s!} \sum_{n=s}^{\infty} \rho^n = p_0\, \frac{(s\rho)^s}{s!(1 - \rho)} = \frac{\dfrac{(s\rho)^s}{s!(1 - \rho)}}{\displaystyle\sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \frac{(s\rho)^s}{s!(1 - \rho)}} \tag{9.49}$$

Equation (9.49) is sometimes referred to as Erlang's delay (or C) formula and denoted by $C(s, \lambda/\mu)$. Equation
(9.49) is widely used in telephone systems and gives the probability that no trunk (server) is available for an
incoming call (arriving customer) in a system of $s$ trunks.
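Erlang's delay formula (9.49) can be evaluated directly. In the sketch below the function name `erlang_c` and the argument name `a` (standing for $\lambda/\mu = s\rho$) are illustrative choices, and the example values are hypothetical: two servers with $\lambda/\mu = 0.75$, and three servers with $\lambda/\mu = 1.5$.

```python
from math import factorial


def erlang_c(s, a):
    """Probability that an arrival must queue in an M/M/s system, Eq. (9.49).

    s is the number of servers and a = lam / mu; a steady state needs a < s.
    """
    if not 0 < a < s:
        raise ValueError("Erlang C requires 0 < lam / mu < s")
    rho = a / s
    tail = a**s / (factorial(s) * (1 - rho))            # all-servers-busy term
    head = sum(a**n / factorial(n) for n in range(s))   # fewer than s customers
    return tail / (head + tail)


print(erlang_c(2, 0.75), erlang_c(3, 1.5), erlang_c(1, 0.8))
```

With a single server the formula reduces to $\rho$, the probability that the M/M/1 server is busy, which is a convenient sanity check.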


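9.12. Derive Eqs. (9.22) and (9.23).

Equation (9.21) can be rewritten as

$$p_n = \begin{cases} \dfrac{(s\rho)^n}{n!}\, p_0 & n \leq s \\[2ex] \dfrac{\rho^n s^s}{s!}\, p_0 & n > s \end{cases}$$

Then the average number of customers in the system is

$$L = E(N) = \sum_{n=0}^{\infty} n p_n = p_0 \left[\sum_{n=0}^{s} n\, \frac{(s\rho)^n}{n!} + \frac{s^s}{s!} \sum_{n=s+1}^{\infty} n\rho^n\right]$$

Using the summation formulas,

$$\sum_{n=0}^{\infty} n x^n = \frac{x}{(1 - x)^2} \qquad |x| < 1 \tag{9.50}$$

$$\sum_{n=0}^{k} n x^n = \frac{x[k x^{k+1} - (k + 1)x^k + 1]}{(1 - x)^2} \qquad |x| < 1 \tag{9.51}$$

and Eq. (9.20), we obtain

$$\begin{aligned}
L &= p_0 \left\{ s\rho \sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \frac{s^s}{s!} \left[\frac{\rho}{(1 - \rho)^2} - \frac{\rho[s\rho^{s+1} - (s + 1)\rho^s + 1]}{(1 - \rho)^2}\right] \right\} \\
&= p_0 \left\{ s\rho \left[\sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \frac{(s\rho)^s}{s!(1 - \rho)}\right] + \frac{\rho(s\rho)^s}{s!(1 - \rho)^2} \right\} \\
&= p_0 \left[ s\rho\, \frac{1}{p_0} + \frac{\rho(s\rho)^s}{s!(1 - \rho)^2} \right] \\
&= s\rho + \frac{\rho(s\rho)^s}{s!(1 - \rho)^2}\, p_0 = \frac{\lambda}{\mu} + \frac{\rho(s\rho)^s}{s!(1 - \rho)^2}\, p_0
\end{aligned}$$

Next, using Eqs. (9.21) and (9.50), the average number of customers in the queue is

$$\begin{aligned}
L_q &= \sum_{n=s}^{\infty} (n - s)p_n = \sum_{n=s}^{\infty} (n - s)\, \frac{\rho^n s^s}{s!}\, p_0 \\
&= p_0\, \frac{(s\rho)^s}{s!} \sum_{n=s}^{\infty} (n - s)\rho^{n-s} = p_0\, \frac{(s\rho)^s}{s!} \sum_{m=0}^{\infty} m\rho^m \\
&= \frac{\rho(s\rho)^s}{s!(1 - \rho)^2}\, p_0
\end{aligned}$$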

9.13. A corporate computing center has two computers of the same capacity. The jobs arriving at the
center are of two types, internal jobs and external jobs. These jobs have Poisson arrival times
with rates 18 and 15 per hour, respectively. The service time for a job is an exponential r.v. with
mean 3 minutes.

(a) Find the average waiting time per job when one computer is used exclusively for internal
jobs and the other for external jobs.
(b) Find the average waiting time per job when two computers handle both types of jobs.

(a) When the computers are used separately, we treat them as two M/M/1 queueing systems. Let $W_{q1}$ and
$W_{q2}$ be the average waiting time per internal job and per external job, respectively. For internal jobs,
$\lambda_1 = \frac{18}{60} = \frac{3}{10}$ and $\mu_1 = \frac{1}{3}$. Then, from Eq. (9.17),

$$W_{q1} = \frac{\frac{3}{10}}{\frac{1}{3}\left(\frac{1}{3} - \frac{3}{10}\right)} = 27 \text{ min}$$

For external jobs, $\lambda_2 = \frac{15}{60} = \frac{1}{4}$ and $\mu_2 = \frac{1}{3}$, and

$$W_{q2} = \frac{\frac{1}{4}}{\frac{1}{3}\left(\frac{1}{3} - \frac{1}{4}\right)} = 9 \text{ min}$$

(b) When two computers handle both types of jobs, we model the computing service as an M/M/2
queueing system with

$$\lambda = \frac{18 + 15}{60} = \frac{11}{20} \qquad \mu = \frac{1}{3} \qquad \rho = \frac{\lambda}{2\mu} = \frac{33}{40}$$

Now, substituting $s = 2$ in Eqs. (9.20), (9.22), (9.24), and (9.25), we get

$$p_0 = \left[1 + 2\rho + \frac{(2\rho)^2}{2(1 - \rho)}\right]^{-1} = \frac{1 - \rho}{1 + \rho} \tag{9.52}$$

$$L = 2\rho + \frac{4\rho^3}{2(1 - \rho)^2}\left(\frac{1 - \rho}{1 + \rho}\right) = \frac{2\rho}{1 - \rho^2} \tag{9.53}$$

$$W_q = \frac{L}{\lambda} - \frac{1}{\mu} = \frac{1}{\mu(1 - \rho^2)} - \frac{1}{\mu} = \frac{\rho^2}{\mu(1 - \rho^2)} \tag{9.54}$$

Thus, from Eq. (9.54), the average waiting time per job when both computers handle both types of jobs
is given by

$$W_q = \frac{\left(\frac{33}{40}\right)^2}{\frac{1}{3}\left[1 - \left(\frac{33}{40}\right)^2\right]} \approx 6.39 \text{ min}$$

From these results, we see that it is more efficient for both computers to handle both types of jobs.


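9.14. Derive Eqs. (9.27) and (9.28).

From Eqs. (9.26) and (9.10), we have

$$p_n = \left(\frac{\lambda}{\mu}\right)^n p_0 = \rho^n p_0 \qquad 0 \leq n \leq K \tag{9.55}$$

From Eq. (9.11), $p_0$ is obtained by equating

$$\sum_{n=0}^{K} p_n = p_0 \sum_{n=0}^{K} \rho^n = 1$$

Using the summation formula

$$\sum_{n=0}^{K} x^n = \frac{1 - x^{K+1}}{1 - x} \tag{9.56}$$

we obtain

$$p_0 = \frac{1 - \rho}{1 - \rho^{K+1}} \qquad \text{and} \qquad p_n = \frac{(1 - \rho)\rho^n}{1 - \rho^{K+1}}$$

Note that in this case, there is no need to impose the condition that $\rho = \lambda/\mu < 1$.

9.15. Derive Eq. (9.30).

Using Eqs. (9.28) and (9.51), the average number of customers in the system is given by

$$\begin{aligned}
L = E(N) &= \sum_{n=0}^{K} n p_n = \frac{1 - \rho}{1 - \rho^{K+1}} \sum_{n=0}^{K} n\rho^n \\
&= \frac{1 - \rho}{1 - \rho^{K+1}} \left\{\frac{\rho[K\rho^{K+1} - (K + 1)\rho^K + 1]}{(1 - \rho)^2}\right\} \\
&= \rho\, \frac{1 - (K + 1)\rho^K + K\rho^{K+1}}{(1 - \rho)(1 - \rho^{K+1})}
\end{aligned}$$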


9.16. Consider the M/M/1/K queueing system. Show that

$$L_q = L - (1 - p_0) \tag{9.57}$$

$$W_q = \frac{1}{\mu}\, L \tag{9.58}$$

$$W = \frac{1}{\mu}(L + 1) \tag{9.59}$$

In the M/M/1/K queueing system, the average number of customers in the system is

$$L = E(N) = \sum_{n=0}^{K} n p_n \qquad \text{and} \qquad \sum_{n=0}^{K} p_n = 1$$

The average number of customers in the queue is

$$L_q = E(N_q) = \sum_{n=1}^{K} (n - 1)p_n = \sum_{n=0}^{K} n p_n - \sum_{n=1}^{K} p_n = L - (1 - p_0)$$

A customer arriving with the queue in state $S_n$ has a wait time $T_q$ that is the sum of $n$ independent exponential
r.v.'s, each with parameter $\mu$. The expected value of this sum is $n/\mu$ [Eq. (4.108)]. Thus, the average
amount of time that a customer spends waiting in the queue is

$$W_q = E(T_q) = \sum_{n=0}^{K} \frac{n}{\mu}\, p_n = \frac{1}{\mu} \sum_{n=0}^{K} n p_n = \frac{1}{\mu}\, L$$

Similarly, the amount of time that a customer spends in the system is

$$W = E(T) = \sum_{n=0}^{K} \frac{n + 1}{\mu}\, p_n = \frac{1}{\mu}\left(\sum_{n=0}^{K} n p_n + \sum_{n=0}^{K} p_n\right) = \frac{1}{\mu}(L + 1)$$

Note that Eqs. (9.57) to (9.59) are equivalent to Eqs. (9.31) to (9.33) (Prob. 9.27).


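9.17. Derive Eqs. (9.35) and (9.36).

As in Prob. 9.10, from Eqs. (9.34) and (9.10), we have

$$p_n = p_0 \prod_{k=0}^{n-1} \frac{\lambda}{(k + 1)\mu} = p_0 \left(\frac{\lambda}{\mu}\right)^n \frac{1}{n!} \qquad n < s \tag{9.60}$$

$$p_n = p_0 \prod_{k=0}^{s-1} \frac{\lambda}{(k + 1)\mu} \prod_{k=s}^{n-1} \frac{\lambda}{s\mu} = p_0 \left(\frac{\lambda}{\mu}\right)^n \frac{1}{s!\, s^{n-s}} \qquad s \leq n \leq K \tag{9.61}$$

Let $\rho = \lambda/(s\mu)$. Then Eqs. (9.60) and (9.61) can be rewritten as

$$p_n = \begin{cases} \dfrac{(s\rho)^n}{n!}\, p_0 & n < s \\[2ex] \dfrac{\rho^n s^s}{s!}\, p_0 & s \leq n \leq K \end{cases}$$

which is Eq. (9.36). From Eq. (9.11), $p_0$ is obtained by equating

$$\sum_{n=0}^{K} p_n = p_0 \left[\sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \sum_{n=s}^{K} \frac{\rho^n s^s}{s!}\right] = 1$$

Using the summation formula (9.56), we obtain

$$p_0 = \left[\sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \frac{s^s}{s!} \sum_{n=s}^{K} \rho^n\right]^{-1} = \left[\sum_{n=0}^{s-1} \frac{(s\rho)^n}{n!} + \frac{(s\rho)^s (1 - \rho^{K-s+1})}{s!(1 - \rho)}\right]^{-1}$$

which is Eq. (9.35).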


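9.18. Derive Eq. (9.37).

Using Eqs. (9.36) and (9.51), the average number of customers in the queue is given by

$$\begin{aligned}
L_q &= \sum_{n=s}^{K} (n - s)p_n = p_0\, \frac{s^s}{s!} \sum_{n=s}^{K} (n - s)\rho^n \\
&= p_0\, \frac{(s\rho)^s}{s!} \sum_{n=s}^{K} (n - s)\rho^{n-s} = p_0\, \frac{(s\rho)^s}{s!} \sum_{m=0}^{K-s} m\rho^m \\
&= p_0\, \frac{(s\rho)^s}{s!}\, \frac{\rho[(K - s)\rho^{K-s+1} - (K - s + 1)\rho^{K-s} + 1]}{(1 - \rho)^2} \\
&= p_0\, \frac{\rho(s\rho)^s}{s!(1 - \rho)^2} \left\{1 - [1 + (1 - \rho)(K - s)]\rho^{K-s}\right\}
\end{aligned}$$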


9.19. Consider an M/M/s/s queueing system. Find the probability that all servers are busy.

Setting $K = s$ in Eqs. (9.60) and (9.61), we get

$$p_n = p_0 \left(\frac{\lambda}{\mu}\right)^n \frac{1}{n!} \qquad 0 \leq n \leq s \tag{9.62}$$

and $p_0$ is obtained by equating

$$\sum_{n=0}^{s} p_n = p_0 \sum_{n=0}^{s} \left(\frac{\lambda}{\mu}\right)^n \frac{1}{n!} = 1$$

Thus

$$p_0 = \left[\sum_{n=0}^{s} \frac{(\lambda/\mu)^n}{n!}\right]^{-1} \tag{9.63}$$

The probability that all servers are busy is given by

$$p_s = p_0 \left(\frac{\lambda}{\mu}\right)^s \frac{1}{s!} = \frac{(\lambda/\mu)^s / s!}{\displaystyle\sum_{n=0}^{s} (\lambda/\mu)^n / n!} \tag{9.64}$$

Note that in an M/M/s/s queueing system, if an arriving customer finds that all servers are busy, the
customer will turn away and is lost. In a telephone system with $s$ trunks, $p_s$ is the portion of incoming calls
which will receive a busy signal. Equation (9.64) is often referred to as Erlang's loss (or B) formula and is
commonly denoted as $B(s, \lambda/\mu)$.
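Erlang's loss formula (9.64) can be computed either directly or with a recursion on the number of servers that avoids large factorials. The recursion $B(n, a) = aB(n-1, a)/[n + aB(n-1, a)]$ with $B(0, a) = 1$ is a standard numerical device that is not derived in the text; it is included here only as an alternative sketch. The function names and the example values (4 servers, $\lambda/\mu = 6$) are illustrative assumptions.

```python
from math import factorial


def erlang_b_direct(s, a):
    """Blocking probability of an M/M/s/s system from Eq. (9.64); a = lam / mu."""
    terms = [a**n / factorial(n) for n in range(s + 1)]
    return terms[s] / sum(terms)


def erlang_b_recursive(s, a):
    """Same quantity via the recursion B(n) = a*B(n-1) / (n + a*B(n-1)), B(0) = 1."""
    b = 1.0
    for n in range(1, s + 1):
        b = a * b / (n + a * b)
    return b


direct = erlang_b_direct(4, 6.0)
recursive = erlang_b_recursive(4, 6.0)
print(direct, recursive)
```

Both routes give $54/115 \approx 0.47$ for the example, so nearly half of the arrivals would be blocked at that load.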


9.20. An air freight terminal has four loading docks on the main concourse. Any aircraft which arrive
when all docks are full are diverted to docks on the back concourse. The average aircraft arrival
rate is 3 aircraft per hour. The average service time per aircraft is 2 hours on the main concourse
and 3 hours on the back concourse.

(a) Find the percentage of the arriving aircraft that are diverted to the back concourse.

(b) If a holding area which can accommodate up to 8 aircraft is added to the main concourse,
find the percentage of the arriving aircraft that are diverted to the back concourse and the
expected delay time awaiting service.

(a) The service system at the main concourse can be modeled as an M/M/s/s queueing system with $s = 4$,
$\lambda = 3$, $\mu = \frac{1}{2}$, and $\lambda/\mu = 6$. The percentage of the arriving aircraft that are diverted to the back
concourse is

$$100 \times P(\text{all docks on the main concourse are full})$$

From Eq. (9.64),

$$P(\text{all docks on the main concourse are full}) = p_4 = \frac{6^4/4!}{\displaystyle\sum_{n=0}^{4} 6^n/n!} = \frac{54}{115} \approx 0.47$$

Thus, the percentage of the arriving aircraft that are diverted to the back concourse is about 47
percent.

(b) With the addition of a holding area for 8 aircraft, the service system at the main concourse can now be
modeled as an M/M/s/K queueing system with $s = 4$, $K = 12$, and $\rho = \lambda/(s\mu) = 1.5$. Now, from Eqs.
(9.35) and (9.36),

$$p_0 = \left[\sum_{n=0}^{3} \frac{6^n}{n!} + \frac{6^4}{4!}\left(\frac{1 - 1.5^9}{1 - 1.5}\right)\right]^{-1} \approx 0.00024$$

$$p_{12} = \frac{1.5^{12}\, 4^4}{4!}\, p_0 \approx 0.332$$

Thus, about 33.2 percent of the arriving aircraft will still be diverted to the back concourse.
Next, from Eq. (9.37), the average number of aircraft in the queue is

$$L_q = 0.00024\, \frac{1.5(6^4)}{4!(1 - 1.5)^2} \left\{1 - [1 + (1 - 1.5)8](1.5)^8\right\} \approx 6.0565$$

Then, from Eq. (9.40), the expected delay time waiting for service is

$$W_q = \frac{L_q}{\lambda(1 - p_{12})} = \frac{6.0565}{3(1 - 0.332)} \approx 3.022 \text{ hours}$$

Note that when the 2-hour service time is added, the total expected processing time at the main
concourse will be 5.022 hours compared to the 3-hour service time at the back concourse.


Supplementary Problems 


9.21. Customers arrive at the express checkout lane in a supermarket in a Poisson process with a rate of 15 per
hour. The time to check out a customer is an exponential r.v. with mean of 2 minutes.

(a) Find the average number of customers present.
(b) What is the expected idle delay time experienced by a customer?
(c) What is the expected time for a customer to clear a system?

Ans. (a) 1; (b) 2 min; (c) 4 min


9.22. Consider an M/M/1 queueing system. Find the probability of finding at least $k$ customers in the system.

Ans. $\rho^k = (\lambda/\mu)^k$


9.23. In a university computer center, 80 jobs an hour are submitted on the average. Assuming that the computer
service is modeled as an M/M/1 queueing system, what should the service rate be if the average turnaround
time (time at submission to time of getting job back) is to be less than 10 minutes?

Ans. 1.43 jobs per minute


9.24. The capacity of a communication line is 2000 bits per second. The line is used to transmit 8-bit characters,
and the total volume of expected calls for transmission from many devices to be sent on the line is 12,000
characters per minute. Find (a) the traffic intensity, (b) the average number of characters waiting to be
transmitted, and (c) the average transmission (including queueing delay) time per character.

Ans. (a) 0.8; (b) 3.2; (c) 20 ms


9.25. A bank counter is currently served by two tellers. Customers entering the bank join a single queue and go
to the next available teller when they reach the head of the line. On the average, the service time for a
customer is 3 minutes, and 15 customers enter the bank per hour. Assuming that the arrivals process is
Poisson and the service time is an exponential r.v., find the probability that a customer entering the bank
will have to wait for service.

Ans. 0.205


9.26. A post office has three clerks serving at the counter. Customers arrive on the average at the rate of 30 per
hour, and arriving customers are asked to form a single queue. The average service time for each customer
is 3 minutes. Assuming that the arrivals process is Poisson and the service time is an exponential r.v., find
(a) the probability that all the clerks will be busy, (b) the average number of customers in the queue, and (c)
the average length of time customers have to spend in the post office.

Ans. (a) 0.237; (b) 0.237; (c) 3.947 min


9.27. Show that Eqs. (9.57) to (9.59) and Eqs. (9.31) to (9.33) are equivalent.

Hint: Use Eq. (9.29).


9.28. Find the average number of customers $L$ in the M/M/1/K queueing system when $\lambda = \mu$.

Ans. $K/2$


9.29. A gas station has one diesel fuel pump for trucks only and has room for three trucks (including one at the
pump). On the average trucks arrive at the rate of 4 per hour, and each truck takes 10 minutes to service.
Assume that the arrivals process is Poisson and the service time is an exponential r.v.

(a) What is the average time for a truck from entering to leaving the station?
(b) What is the average time for a truck to wait for service?
(c) What percentage of the truck traffic is being turned away?

Ans. (a) 20.15 min; (b) 10.14 min; (c) 12.3 percent


9.30. Consider the air freight terminal service of Prob. 9.20. How many additional docks are needed so that at
least 80 percent of the arriving aircraft can be served in the main concourse with the addition of the holding
area?

Ans. 4


Appendix A 


Normal Distribution 


$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\xi^2/2}\, d\xi$$

$$\Phi(-z) = 1 - \Phi(z)$$

Fig. A

Table A (continued)

The material below refers to Fig. A.

$\alpha$           0.2     0.1     0.05    0.025   0.01    0.005
$z_{\alpha/2}$     1.282   1.645   1.960   2.240   2.576   2.807

Appendix B 


Fourier Transform 


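B.1 CONTINUOUS-TIME FOURIER TRANSFORM


Definition:

$$X(\omega) = \int_{-\infty}^{\infty} x(t) e^{-j\omega t}\, dt \qquad\qquad x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(\omega) e^{j\omega t}\, d\omega$$


Table B-1 Properties of the Continuous-Time Fourier Transform

Property                     Signal                                                                    Fourier transform
                             $x(t)$                                                                    $X(\omega)$
                             $x_1(t)$                                                                  $X_1(\omega)$
                             $x_2(t)$                                                                  $X_2(\omega)$
Linearity                    $a_1 x_1(t) + a_2 x_2(t)$                                                 $a_1 X_1(\omega) + a_2 X_2(\omega)$
Time shifting                $x(t - t_0)$                                                              $e^{-j\omega t_0} X(\omega)$
Frequency shifting           $e^{j\omega_0 t} x(t)$                                                    $X(\omega - \omega_0)$
Time scaling                 $x(at)$                                                                   $\frac{1}{|a|} X\left(\frac{\omega}{a}\right)$
Time reversal                $x(-t)$                                                                   $X(-\omega)$
Duality                      $X(t)$                                                                    $2\pi x(-\omega)$
Time differentiation         $\frac{dx(t)}{dt}$                                                        $j\omega X(\omega)$
Frequency differentiation    $(-jt)x(t)$                                                               $\frac{dX(\omega)}{d\omega}$
Integration                  $\int_{-\infty}^{t} x(\tau)\, d\tau$                                      $\pi X(0)\delta(\omega) + \frac{1}{j\omega} X(\omega)$
Convolution                  $x_1(t) * x_2(t) = \int_{-\infty}^{\infty} x_1(\tau)x_2(t - \tau)\, d\tau$    $X_1(\omega)X_2(\omega)$
Multiplication               $x_1(t)x_2(t)$                                                            $\frac{1}{2\pi} X_1(\omega) * X_2(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X_1(\lambda)X_2(\omega - \lambda)\, d\lambda$
Real signal                  $x(t) = x_e(t) + x_o(t)$                                                  $X(\omega) = A(\omega) + jB(\omega)$, $X(-\omega) = X^*(\omega)$
Even component               $x_e(t)$                                                                  $\text{Re}\{X(\omega)\} = A(\omega)$
Odd component                $x_o(t)$                                                                  $j\,\text{Im}\{X(\omega)\} = jB(\omega)$

Parseval's theorem           $\int_{-\infty}^{\infty} |x(t)|^2\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |X(\omega)|^2\, d\omega$


Table B-2 Common Continuous-Time Fourier Transform Pairs

$x(t)$                                                          $X(\omega)$
$\delta(t)$                                                     $1$
$\delta(t - t_0)$                                               $e^{-j\omega t_0}$
$1$                                                             $2\pi\delta(\omega)$
$e^{j\omega_0 t}$                                               $2\pi\delta(\omega - \omega_0)$
$\cos \omega_0 t$                                               $\pi[\delta(\omega - \omega_0) + \delta(\omega + \omega_0)]$
$\sin \omega_0 t$                                               $-j\pi[\delta(\omega - \omega_0) - \delta(\omega + \omega_0)]$
$u(t) = 1$ for $t > 0$, $0$ for $t < 0$                         $\pi\delta(\omega) + \frac{1}{j\omega}$
$e^{-at}u(t)$, $a > 0$                                          $\frac{1}{j\omega + a}$
$\sum_{k=-\infty}^{\infty} \delta(t - kT)$                      $\omega_0 \sum_{k=-\infty}^{\infty} \delta(\omega - k\omega_0)$, $\omega_0 = \frac{2\pi}{T}$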


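B.2 DISCRETE-TIME FOURIER TRANSFORM


Definition:

$$X(\Omega) = \sum_{n=-\infty}^{\infty} x(n) e^{-j\Omega n} \qquad\qquad x(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(\Omega) e^{j\Omega n}\, d\Omega$$


Table B-3 Properties of the Discrete-Time Fourier Transform

Property                     Sequence                                                              Fourier transform
                             $x(n)$                                                                $X(\Omega)$
                             $x_1(n)$                                                              $X_1(\Omega)$
                             $x_2(n)$                                                              $X_2(\Omega)$
Periodicity                  $x(n)$                                                                $X(\Omega + 2\pi) = X(\Omega)$
Linearity                    $a_1 x_1(n) + a_2 x_2(n)$                                             $a_1 X_1(\Omega) + a_2 X_2(\Omega)$
Time shifting                $x(n - n_0)$                                                          $e^{-j\Omega n_0} X(\Omega)$
Frequency shifting           $e^{j\Omega_0 n} x(n)$                                                $X(\Omega - \Omega_0)$
Time reversal                $x(-n)$                                                               $X(-\Omega)$
Frequency differentiation    $n x(n)$                                                              $j\,\frac{dX(\Omega)}{d\Omega}$
Accumulation                 $\sum_{k=-\infty}^{n} x(k)$                                           $\pi X(0)\delta(\Omega) + \frac{1}{1 - e^{-j\Omega}} X(\Omega)$
Convolution                  $x_1(n) * x_2(n) = \sum_{k=-\infty}^{\infty} x_1(k)x_2(n - k)$        $X_1(\Omega)X_2(\Omega)$
Multiplication               $x_1(n)x_2(n)$                                                        $\frac{1}{2\pi} X_1(\Omega) \otimes X_2(\Omega) = \frac{1}{2\pi} \int_{-\pi}^{\pi} X_1(\lambda)X_2(\Omega - \lambda)\, d\lambda$
Real sequence                $x(n) = x_e(n) + x_o(n)$                                              $X(\Omega) = A(\Omega) + jB(\Omega)$, $X(-\Omega) = X^*(\Omega)$
Even component               $x_e(n)$                                                              $\text{Re}\{X(\Omega)\} = A(\Omega)$
Odd component                $x_o(n)$                                                              $j\,\text{Im}\{X(\Omega)\} = jB(\Omega)$

Parseval's theorem           $\sum_{n=-\infty}^{\infty} |x(n)|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |X(\Omega)|^2\, d\Omega$


Table B-4 Common Discrete-Time Fourier Transform Pairs

      $x(n)$                                                              $X(\Omega)$
1.    $\delta(n) = 1$ for $n = 0$, $0$ for $n \neq 0$                     $1$
2.    $\delta(n - n_0)$                                                   $e^{-j\Omega n_0}$
3.    $x(n) = 1$                                                          $2\pi\delta(\Omega)$
4.    $e^{j\Omega_0 n}$                                                   $2\pi\delta(\Omega - \Omega_0)$
5.    $\cos \Omega_0 n$                                                   $\pi[\delta(\Omega - \Omega_0) + \delta(\Omega + \Omega_0)]$
6.    $\sin \Omega_0 n$                                                   $-j\pi[\delta(\Omega - \Omega_0) - \delta(\Omega + \Omega_0)]$
7.    $u(n) = 1$ for $n \geq 0$, $0$ for $n < 0$                          $\pi\delta(\Omega) + \frac{1}{1 - e^{-j\Omega}}$
8.    $a^n u(n)$, $|a| < 1$                                               $\frac{1}{1 - ae^{-j\Omega}}$
9.    $(n + 1)a^n u(n)$, $|a| < 1$                                        $\frac{1}{(1 - ae^{-j\Omega})^2}$
10.   $a^{|n|}$, $|a| < 1$                                                $\frac{1 - a^2}{1 - 2a\cos\Omega + a^2}$
11.   $x(n) = 1$ for $|n| \leq N_1$, $0$ for $|n| > N_1$                  $\frac{\sin[\Omega(N_1 + \frac{1}{2})]}{\sin(\Omega/2)}$
12.   $\frac{\sin Wn}{\pi n}$, $0 < W < \pi$                              $X(\Omega) = 1$ for $0 \leq |\Omega| \leq W$, $0$ for $W < |\Omega| \leq \pi$
13.   $\sum_{k=-\infty}^{\infty} \delta(n - kN_0)$                        $\Omega_0 \sum_{k=-\infty}^{\infty} \delta(\Omega - k\Omega_0)$, $\Omega_0 = \frac{2\pi}{N_0}$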
A
a priori probability, 266 
a posteriori probability, 266 
Absorbing states, 168 
Absorption, 168 
probability, 168 
Acceptance region, 264 
Accessible states, 167 
Algebra of sets, 2-5, 12 
Alternative hypothesis, 264 
Aperiodic states, 168 
Arrival parameter, 282 
Arrival process, 170, 201 
Autocorrelation, 162, 210 
Autocovariance, 162 
Axioms of probability, 5-6 


B 
Bayes' 
estimator, 249 
estimation, 248, 255 
rule, 8 
test, 267 
theorem, 8 
Bernoulli 


distribution, 43 
experiment, 33 
process, 172 
r.v., 43
trials, 33 
Best estimator, 249 
Biased estimator, 251 
Binomial 
distribution, 44 
coefficient, 44 
r.v., 44 
Birth-death process, 282 
Bivariate 
normal distribution, 88 
r.v., 79, 89 
Bonferroni's inequality, 17 
Boole's inequality, 18 
Brownian motion process (see Wiener process) 
Buffon's needle, 103 


C 

Cauchy 
criterion, 221 
r.v., 77-78, 135 


Cauchy-Schwarz inequality, 108 

Central limit theorem, 47, 128-129, 155-156 
Chapman-Kolmogorov equation, 166
Characteristic function, 127-128, 154 
Chebyshev inequality, 66
Chi-square (χ²) r.v., 146
Complement of set, 2 
Complex random process, 161-162, 218 
Composite hypothesis, 264 
Conditional 
distribution, 48, 71, 83, 104 
expectation, 85, 126 
mean, 85, 110 
probability, 7, 24 
probability density function (pdf), 83 
probability mass function (pmf), 83 
variance, 85, 110 
Confidence 
coefficient, 258 
interval, 258 
Consistent estimator, 248 
Continuity theorem of probability, 19 
Convolution, 137, 214 
integral, 214 
sum, 214 
Correlation, 85 
coefficient, 84-85, 107 
Counting process, 170 
Covariance, 84-85, 107 
matrix, 89 
Craps, 34 
Critical region, 264 
Cross-correlation, 211 
Cross power spectral density (or spectrum), 212 
Cumulative distribution function (cdf), 37 


D 

Decision test, 265, 271 
Bayes', 267, 274 
likelihood ratio, 266 
MAP (maximum a posteriori), 266 
maximum-likelihood, 265 
minimax (min-max), 267 
minimum probability of error, 267
Neyman-Pearson, 266, 272 

Decision theory, 264 

De Morgan's laws, 5 

Dirac δ function, 213

Disjoint sets, 3 


Distribution
Bernoulli, 43
binomial, 44
conditional, 48, 71, 83, 104
exponential, 46
first-order, 162
limiting, 169
normal (or gaussian), 47
nth-order, 162
Poisson, 44, 68
second-order, 162
stationary, 169
uniform, 45

Distribution function, 37, 39, 49 
cumulative (cdf), 37 

Domain, 38 


E 
Efficient estimator, 247 
Ensemble, 161 
average, 162 
Equally likely events, 7, 20 
Ergodic, in the mean, 243 
process, 165 
Erlang's, delay (or C) formula, 290 
loss (or B) formula, 294 
Estimates, point, 247 
interval, 247 
Estimation, 247 
Bayes’, 255 
error, 249 
linear, 249 
mean square, 249 
maximum likelihood, 253 
mean square, 249 
parameter, 247 
Estimator, Bayes', 249
best, 249 
biased, 251 
consistent, 248 
efficient, 247 
maximum-likelihood, 248 
minimum mean square error, 249 
point, 247, 250 
unbiased, 247 
Events, 2, 9
certain, 3 
elementary, 2 
equally likely, 7, 20 
impossible, 3 
independent, 8 
mutually exclusive, 6 
and exhaustive, 8 
null, 3 
Expectation, 42, 125 
conditional, 85, 126 
Expected value (see Mean) 
Experiment, Bernoulli, 33 
random, 1 
Exponential, distribution, 46 
r.v., 46 


F 
Fourier series, 216, 236 
Parseval's theorem for, 237
Fourier transform, 218, 240, 299 
Functions of r.v.'s, 122-124, 129, 136, 144 


G 
Gambler's ruin, 190 
Gamma, function, 59 
r.v., 59, 145 
Gaussian distribution (see Normal distribution) 
Geometric r.v., 55, 62 


H 
Hypergeometric r.v., 76 
Hypothesis 
alternative, 264 
composite, 264 
null, 264 
simple, 264 
Hypothesis testing, 264-275, 268 
level of significance, 265 
power of, 265 


I 
Impulse response, 214 
Independent (statistically) 
events, 8 
increments, 163 
process, 163 
r.v.'s, 80-81, 83 
Interarrival process, 170 
Intersection of sets, 2 
Interval estimate, 247 


J 

Jacobian, 125 

Joint 
characteristic function, 127 
distribution function, 80, 89 
moment-generating function, 126 
probability density function (pdf), 82 
probability mass function (pmf), 82 


K 
Karhunen-Loeve expansion, 217, 231 


L 
Lagrange multiplier, 266 
Laplace r.v., 77 
Law of large numbers, 128, 155 
Level of significance, 265 
Likelihood 

function, 248 

ratio, 265 

test, 260 
Limiting distribution, 169
Linear mean-square estimation, 249 
Linear system, 213, 231 
impulse response of, 214 
response to random inputs, 213-215, 231 
Little's formula, 282 
Log-normal r.v., 134 


M 
MAP (maximum a posteriori) test, 266 
Marginal 
distribution function, 81 
cumulative distribution function (cdf), 81 
probability density function (pdf), 82 
probability mass function (pmf), 81 
Markov 
chains, 164 
discrete-parameter, 165-169, 185 
homogeneous, 165 
irreducible, 167 
nonhomogeneous, 165 
regular, 169 
inequality, 66 
matrix, 166 
process, 164, 183 
property, 74, 164 
Maximum likelihood estimator, 248 
Mean, 42 
Mean square 
continuity, 209 
derivative, 209 
error, 249 
minimum, 250 
estimation, 249 
linear, 249 
integral, 210 
periodicity, 216 
Median, 76 
Memoryless property (see Markov property) 
Mercer's theorem, 217 
Minimax (min-max) test, 267 
Minimum probability of error test, 267 
Minimum variance estimator, 248 
Mixed r.v., 41 
Mode, 76 
Moment, 42, 84 
Moment generating function, 126 
Most efficient estimator, 248 
Multinomial 
coefficient, 114 
distribution, 88 
theorem, 114 
trial, 88 
Multiple r.v., 79 
Mutually exclusive
events, 3, 6, 9 
and exhaustive events, 8 
sets, 3 


N 
Negative binomial r.v., 77 
Neyman-Pearson test, 266, 272 
Normal 
distribution, 47, 297 
bivariate, 88 
n-variate, 88 
process, 164, 184 
r.v., 47 
standard, 47 
Null 
event (set), 3 
hypothesis, 264 
recurrent state, 168 


O 

Orthogonal r.v., 85 
Orthogonality principle, 250 
Outcomes, 1


P 
Parameter estimation, 247 
Parameter set, 161 
Parseval's theorem, 237 
Periodic states, 168 
Point estimators, 247, 250 
Point of occurrence, 169 
Poisson, distribution, 44, 68 
process, 169-171 

r.v., 44 

white noise, 230 
Positive recurrent states, 168 
Posterior probability, 266 
Power 

function, 269 

of test, 265 
Power spectral density (or spectrum), 210-213, 225 
Prior probability, 266 
Probability, 1
density function (pdf), 41
mass function (pmf), 41
measure, 5


Q
Queueing
system, 281
theory, 281


Index 


R 
Random 
experiment, 1
process, 161 
complex, 161-162, 218 
independent, 163 
real, 161 
sample, 155, 247 
telegraph signal, 228 
semi, 227 
variable (r.v.), 38 
continuous, 41, 76, 82 
discrete, 41 
function of, 122 
mixed, 41 
vector, 79, 86 
walk, 173 
simple, 173, 183, 195 
Range, 38 


Rayleigh r.v., 59, 143 
Real random process, 161 
Recurrent states, 167 

null, 168 

positive, 168 
Regression line, 250 
Rejection region, 264 
Relative frequency, 5 
Renewal process, 170 


S
Sample 
function, 161 
mean, 128, 155 
point, 1
space, 1
variance, 262 
vector (see Random sample) 
Sets, 1 
algebra of, 2-5, 12 
disjoint, 3 
intersection of, 2 
mutually exclusive, 3 
union of, 2 
Simple 
hypothesis, 264 
random walk, 173, 183, 195 
Standard 
deviation, 43 
normal r.v., 47 
State space, 161 
States 
absorbing, 168 
accessible, 167 
aperiodic, 168 
periodic, 168 
recurrent, 167 
null, 168 
positive, 168 
transient, 168 
Stationary 
distributions, 169 
independent increments, 163 
processes, 163 
strict sense, 163 
wide sense (WSS), 163 
transition probability, 165 
Statistic, 247 
sufficient, 277 
Stochastic, continuity, 209 
derivative, 209 
integral, 210 
matrix (see Markov matrix)
periodicity, 216
process (see Random process)
System, linear, 213 
linear time invariance (LTI), 213-216
response to random inputs, 213-216 
parallel, 33 
series, 12 


T 

Threshold value, 266 

Time-average, 165 

Time autocorrelation function, 165 

Total probability, 8 

Traffic intensity, 283-284 

Transient states, 168 

Transition probability, 165 
matrix, 165 
stationary, 165 

Type I error, 264 

Type II error, 264 


U 

Unbiased estimator, 247 

Uncorrelated r.v.'s, 85 

Uniform, distribution, 45 
r.v., 45

Union of sets, 2 

Unit, impulse function (see Dirac δ function)
impulse sequence, 213 
sample response, 214 
sample sequence, 213 

Universal set, 1 


V

Variance, 42 
conditional, 85 

Vector mean, 89 


Venn diagram, 3 


W
Waiting time, 202 
White noise, 213, 229 
normal (or gaussian), 229 
Poisson, 230 
Wiener-Khinchin relations, 211 
Wiener process, 172, 204 
standard, 172 
with drift coefficient, 172 